mescc-tools-Release_0.7.0/.gitignore
## Copyright (C) 2017 Jeremiah Orians
## This file is part of mescc-tools.
##
## mescc-tools is free software: you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation, either version 3 of the License, or
## (at your option) any later version.
##
## mescc-tools is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with mescc-tools. If not, see .

# Ignore build directory
bin/

# Ignore Temp directory
temp/

# Ignore test result directory
test/results/

# Ignore temp and special files
*.tst
*.qst

# Ignore bootstrap build files
M1-macro-footer.M1
M1-macro.M1
M1-macro.hex2
blood-elf-footer.M1
blood-elf.M1
blood-elf.hex2
hex2_linker-footer.M1
hex2_linker.M1
hex2_linker.hex2
get_machine-footer.M1
get_machine.M1
get_machine.hex2
exec_enable-footer.M1
exec_enable.M1
exec_enable.hex2

mescc-tools-Release_0.7.0/CHANGELOG.org
## Copyright (C) 2017 Jeremiah Orians
## This file is part of mescc-tools
##
## mescc-tools is free software: you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation, either version 3 of the License, or
## (at your option) any later version.
## ## mescc-tools is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with mescc-tools. If not, see . * Current ** Added Added support for AMD64 dwarf footers in blood-elf via --64 Added hex0 for i386 in NASM, M1 and hex0 Added hex1 for i386 in NASM, M1, hex1 and hex0 Added first generation AARCH64 elf header Added hex2 for i386 in NASM, M1, hex2 and hex1 Added M0 for i386 in NASM, M1 and hex2 Added catm for i386 in NASM, M1 and hex0 Added support for EOF in line comments in hex2 and M1; thanks to markjenkins Added prototype M1 Manpage Added prototype hex2 Manpage Added prototype blood-elf Manpage Added prototype kaem Manpage Added prototype get_machine Manpage Added cc_x86 for AMD64 in NASM and M1 Added cc_x86 for x86 in NASM and M1 Added cc_amd64 for AMD64 in NASM and M1 Added cc_amd64 for x86 in NASM and M1 ** Changed ** Fixed Removed duplicate in kaem's help Fixed regression in M1 in regards to knight null padding ** Removed * 0.6 - 2019-04-14 ** Added Added template ELF headers for ARM Added initial support for ARM Added official hex0 seed for AMD64 Added official hex1 seed for AMD64 Added support for Added catm NASM prototype to simplify build Added catm M1 prototype to reduce bootstrap dependency Added catm hex0 prototype to eliminate bootstrap dependencies down to hex0 Added M0 NASM prototype to simplify build Added M0 M1 prototype to reduce bootstrap dependency Added M0 hex2 prototype to eliminate bootstrap dependencies down to hex2 Verified ARM port to support M2-Planet ** Changed Updated build.sh and kaem.run to the current mescc-tools syntax Reduced get_machine's build dependencies Cleaned up x86 elf headers Removed kaem's dependence on getopt Replaced --Architecture with --architecture changed 
get_machine's default output to filter machine names into known families Reduced M1 null padding of strings to a single null for all architectures except Knight Updated AMD64 bootstrap kaem.run to include steps from hex0 to M0 ** Fixed Fixed broken test9 thanks to janneke Fixed wrong displacement calculations for ARM immediates Fixed typo in license header Fixed kaem.run to actually function and produce identical results Fixed regression caused by linux 4.17 Removed false newline added in numerate_number for zero case Fixed broken bootstrap script ** Removed Removed final dependency on getopt Removed need to know architecture numbers as that was a bad idea * 0.5 - 2018-06-15 ** Added Added INSTALL notes Added HACKING notes Added examples of minimal Hex1, Hex2 and M1-macro programs that may need to be written to bootstrap a particular architecture. Added useful functions to reduce bootstrap dependencies Added support for binary output in M1-macro ** Changed Changed Knight architecture offset calculation to match new standard Updated test3 lisp.s to include more functionality Updated test3 definitions file to reflect changes in Knight instruction encoding enhanced README to be more useful Pulled numerate_string functionality out of hex2 and M1 into a shared library Eliminated getopt from M1-Macro, hex2-linker and blood-elf; use --Architecture 1 instead of --Architecture=1 ** Fixed Corrected M1-macro incorrectly expressing negative numbers Updated test3 checksum to reflect new version of lisp.s fixed check.sh to actually perform all checks. 
Fixed build.sh to function in a self-hosting fashion ** Removed Removed blood-elf's dependency on getopt Removed C preprocessor macro from blood-elf needed for mescc support Removed hex2's dependency on getopt Removed C preprocessor macro from hex2 needed for mescc support Removed need for octal support in the building of hex2 Removed M1's dependency on getopt Removed C preprocessor macro from M1 needed for mescc support Removed need for sprintf from M1 * 0.4 - 2018-02-24 ** Added Added file checks to reduce the number of error messageless faults Added a current generation M1.M1 file as a test for mescc-tools Added prototype kaem build script M1-macro now catches undefined macros to allow easier troubleshooting Added kaem build tool Added ability to track build progress in kaem Added support for line escapes in kaem Added support for --strict in kaem to halt in the event of errors Added selectable script file support in kaem Added support for PATH search to kaem with fallbacks in the event of NULL environments ** Changed flipped blood-elf from ignoring :: to :_ converted test8 into a full test Added bash style line comments to kaem Added support for raw strings to kaem Stopped showing comment lines in kaem --verbose Removed dependence on getenv to have more control over environmental lookup ** Fixed Fixed stack overflow bug caused by too deeply nested recursion by transforming into iteration Fixed default repo to point to current repo Added missing license header to kaem.c Fixed infinite looping in kaem scripts that hit an error that resets the file descriptor ** Removed Removed need for strtol Removed need for a global variable in M1-Macro Removed legacy functions from kaem * 0.3 - 2017-12-01 ** Added Incorporated a hex0 test which implements hex1 functionality Added --output and --exec_enable options to hex2 Added --output option to M1 Wrote Hex1 in Hex0 for AMD64/ELF Added the ability to specify an output file Added exec_enable to allow the arbitrary setting of 
executable bits Added get_machine to enable better scripting Incorporated janneke's build scripts Added a test to test for unusual nybble and byte order/formatting issues Added blood-elf to generate elf footer capable of being used by objdump ** Changed Renamed MESCC_Tools to mescc-tools to harmonize with guix package name Now all tests will be architecture specific Modified sprintf to behave correctly for negative numbers Converted blood-elf to read M1-macro input and output M1-macro output replaced uint with unsigned to better match the standard Harmonized MAXSTRING to 4096 bytes ** Fixed Incorporated janneke's patches to fix mescc compatibility Fixed test on ARM platforms Fixed range check to behave correctly with unsigned ints ** Removed Removed the need to redirect hex2 output into a file Removed the need for chmod u+x in development paths Removed the need to redirect M1 output into a file Removed the need for chmod entirely from bootstrap path Removed dependency on shell supporting redirects Removed need for stdint and stdbool Removed need for enum support Removed need for strtol in M1-macro * 0.2 - 2017-07-25 ** Added created test2 (a 32bit x86 hex assembler) with its associated build and test changes Fixed proof answers for test1 and test2 Added support to M0 for multiple architectures Added range checking into M0 to make sure immediates will fit into specified space Added a basic tutorial for generating new M0 definitions Created a M1 compatible version of test0 Added an amd64 program for enabling execute bits (might need to later alter the 0777) Added an i386 program for enabling execute bits (might need to later alter the 0777) Added rain1's improvements to gcc flags Added rain1's stack reduction recommendations Incorporated an AMD64/elf hex1 example program as a test Incorporated Test7 into make test and make clean flows ** Changed Adjusted tags to reflect current CHANGELOG Make test now depends upon test2 completing Changed how M0 processes input to
reduce stack usage and improve performance Renamed M0 to M1 to reflect the additional functionality it provides Applied Janneke's patch for accepting hex numerics in M1 Refactored x86/amd64 elf headers to a standard to avoid duplication Standardized C flags for compiling M1 and Hex2 Made eval_immediates iterative instead of recursive Made identify_macros iterative instead of recursive Made process_string iterative instead of recursive Made preserve_other iterative instead of recursive Made print_hex iterative instead of recursive Incremented version numbers for hex2 and M1 Updated guix.scm to match the new version and finish the release Converted guix.scm definition for mescc_tools to use uri method instead of git ** Fixed Removed unrequired temp file in test1 Clarified meaning of Label>base displacement conditional Corrected error in test0 elf32 Test1 and Test2 to reflect the fact that /bin/bash doesn't exist in guix Fixed M0 regression to continue to support original test code Corrected makefile and build scripts to reflect rename Modified test make scripts to reflect new standard elf headers Fixed base address needed by test5 and its associated checksum Harmonized flags for displaying version with standard ** Removed Removed bashisms from Test1 and Test2 to allow proper behavior on debian based systems Removed alerting on missing files in cleanup target Removed massive M0 Definition lists as they don't serve a useful purpose * 0.1 - 2017-06-25 ** Added Incorporated support for little Endian output format in hex2 Incorporated support for multiple input files in hex2 Added range checking for Hex2 Added support for 1 and 4 byte relative displacements Added Hex2 Test Added the ability to specify a new base address Added example M0 x86 opcode definitions Incorporated support for multiple input files in M0 Added support for little Endian immediate output in M0 Added Hex assembler example test Added support for Label>base in Hex2 Added Version info Added install target 
Added initial guix package definition

** Changed
Displacement calculations are now based on architecture specific rules
M0 Immediates now need prefixes to specify the storage space to use for the immediate

** Fixed
Behavior regarding !label displacements

** Removed

* 0.0 - 2017-05-10
Initial release of MESCC Tools from stage0 High Level prototypes

mescc-tools-Release_0.7.0/Generating_M0_Definitions.org
* License
## Copyright (C) 2017 Jeremiah Orians
## This file is part of mescc-tools.
##
## mescc-tools is free software: you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation, either version 3 of the License, or
## (at your option) any later version.
##
## mescc-tools is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with mescc-tools. If not, see .

* How to use software to generate opcode information
Let's start with a simple program you wish to convert to M1. To start, we are going to write a hex disassembler that uses a lookup table.

** simple assembly example
.text # section declaration
output:
	.byte 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, 0x39, 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x0A

# we must export the entry point to the ELF linker or loader.
# They conventionally recognize _start as their entry point.
# Use ld -e main to override the default if you wish
.global _start

print_byte:
	# Write whatever is in eax
	mov $1, %edx # set the size of chars we want
	mov %eax, %ecx # What we are writing
	mov $1, %ebx # Stdout File Descriptor
	mov $4, %eax # the syscall number for write
	int $0x80 # call the Kernel
	ret

read_byte:
	# Attempt to read a single byte from STDIN
	mov $1, %edx # set the size of chars we want
	mov $input, %ecx # Where to put it
	mov $0, %ebx # Where are we reading from
	mov $3, %eax # the syscall number for read
	int $0x80 # call the Kernel

	# If we didn't read any bytes jump to Done
	test %eax, %eax # check what we got
	jz Done # Got EOF call it done

	# Move our byte into registers for processing
	movb input, %al # load char
	movzx %al, %eax # move char into eax
	ret

_start:
loop:
	call read_byte
	mov %eax, %esi # We have to zero extend it to use it
	mov %eax, %ebp # We have to zero extend it to use it

	# Break out the nibbles
	shr $4, %esi # Purge the bottom 4 bits
	and $0xF, %ebp # Chop off all but the bottom 4 bits

	# add our base pointer
	add $output, %esi # Use that as our index into our array
	add $output, %ebp # Use that as our index into our array

	# Print our first Hex
	mov %esi, %eax # What we are writing
	call print_byte

	# Print our second Hex
	mov %ebp, %eax # What we are writing
	call print_byte

	jmp loop

Done:
	# program completed Successfully
	mov $0, %ebx # All is well
	mov $1, %eax # put the exit syscall number in eax
	int $0x80 # Call it a good day

.data
write_size = 2
input:
	.byte write_size

** Build it
To build the above program (assuming you put it in a file named foo.S and wish to produce a program called foo):
as -o foo.o foo.S
ld -o foo foo.o

** objdump its secrets
If one were to run:
objdump -d foo > opcodes.note

*** Which would contain
Disassembly of section .text:

08048074 <output>:
 8048074:  30 31              xor %dh,(%ecx)
 8048076:  32 33              xor (%ebx),%dh
 8048078:  34 35              xor $0x35,%al
 804807a:  36 37              ss aaa
 804807c:  38 39              cmp %bh,(%ecx)
 804807e:  41                 inc %ecx
 804807f:  42                 inc %edx
 8048080:  43                 inc %ebx
 8048081:  44                 inc %esp
 8048082:  45                 inc %ebp
 8048083:  46                 inc %esi
 8048084:  0a                 .byte 0xa

08048085 <print_byte>:
 8048085:  ba 01 00 00 00     mov $0x1,%edx
 804808a:  89 c1              mov %eax,%ecx
 804808c:  bb 01 00 00 00     mov $0x1,%ebx
 8048091:  b8 04 00 00 00     mov $0x4,%eax
 8048096:  cd 80              int $0x80
 8048098:  c3                 ret

08048099 <read_byte>:
 8048099:  ba 01 00 00 00     mov $0x1,%edx
 804809e:  b9 f3 90 04 08     mov $0x80490f3,%ecx
 80480a3:  bb 00 00 00 00     mov $0x0,%ebx
 80480a8:  b8 03 00 00 00     mov $0x3,%eax
 80480ad:  cd 80              int $0x80
 80480af:  85 c0              test %eax,%eax
 80480b1:  74 34              je 80480e7 <Done>
 80480b3:  a0 f3 90 04 08     mov 0x80490f3,%al
 80480b8:  0f b6 c0           movzbl %al,%eax
 80480bb:  c3                 ret

080480bc <_start>:
 80480bc:  e8 d8 ff ff ff     call 8048099 <read_byte>
 80480c1:  89 c6              mov %eax,%esi
 80480c3:  89 c5              mov %eax,%ebp
 80480c5:  c1 ee 04           shr $0x4,%esi
 80480c8:  83 e5 0f           and $0xf,%ebp
 80480cb:  81 c6 74 80 04 08  add $0x8048074,%esi
 80480d1:  81 c5 74 80 04 08  add $0x8048074,%ebp
 80480d7:  89 f0              mov %esi,%eax
 80480d9:  e8 a7 ff ff ff     call 8048085 <print_byte>
 80480de:  89 e8              mov %ebp,%eax
 80480e0:  e8 a0 ff ff ff     call 8048085 <print_byte>
 80480e5:  eb d5              jmp 80480bc <_start>

080480e7 <Done>:
 80480e7:  bb 00 00 00 00     mov $0x0,%ebx
 80480ec:  b8 01 00 00 00     mov $0x1,%eax
 80480f1:  cd 80              int $0x80

** making sense of the objdump information
*** Labels
When you see
08048074 <output>:
It indicates that at address 0x08048074 the definition of the function output resides.

*** 1OP Instructions
When you see
 8048098:  c3                 ret
It indicates that at address 0x8048098 there is a Return instruction which has the Hex opcode encoding of C3 and could be implemented in M1 as:
DEFINE RETURN C3
Or any other mnemonic term that is more optimal for the problem at hand.
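As a rough illustration of what such a definition amounts to, the sketch below treats DEFINE RETURN C3 as one entry in a lookup table and swaps the mnemonic for its hex text. It is written in C only because the tools themselves are; the names definition, definitions and expand_mnemonic are made up for this example and are not taken from M1-macro.c, which works on a token list rather than a fixed table.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration only; not the M1-macro implementation */
struct definition
{
	const char* name; /* mnemonic as written after DEFINE */
	const char* hex;  /* hex text the mnemonic expands to */
};

/* Table mirroring the single line: DEFINE RETURN C3 */
static const struct definition definitions[] = {
	{ "RETURN", "C3" },
};

/* Return the hex text for a mnemonic, or NULL if it was never defined */
const char* expand_mnemonic(const char* token)
{
	size_t count = sizeof(definitions) / sizeof(definitions[0]);
	size_t i;
	for(i = 0; i < count; i = i + 1)
	{
		if(0 == strcmp(definitions[i].name, token)) return definitions[i].hex;
	}
	return NULL;
}
```

With this table, expand_mnemonic("RETURN") gives back the text C3, while a name that was never defined, such as RET, gives back NULL. M1 itself reports such tokens as an error rather than passing them through.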
*** 2OP Instructions
When you see
 80480c1:  89 c6              mov %eax,%esi
It indicates that at address 0x80480C1 there is a Move instruction that copies the value of register eax to register esi and has the Hex opcode encoding of 89C6 and therefore can be defined in M1 as:
DEFINE COPY_EAX_To_ESI 89C6
or, if we assume (eax=>R0, ebx=>R1, ecx=>R2, edx=>R3, esi=>R4, edi=>R5, ebp=>R6, and esp=>R7):
DEFINE COPY_R0_To_R4 89C6

*** Instructions with Immediates or displacements
Immediates occur in variable sizes, but an immediate cannot exist without an instruction.

**** Trivial example
Most immediates are common values such as 1 (01) or -1 (FF..FF) that are immediately obvious:
 80480a8:  b8 03 00 00 00     mov $0x3,%eax
 8048091:  b8 04 00 00 00     mov $0x4,%eax
 80480ec:  b8 01 00 00 00     mov $0x1,%eax
Especially when there is a very familiar pattern and leading (or, in x86's case, trailing) zeros.

It should be immediately obvious that B8 is the opcode for loading a 32bit immediate value into eax, which can be written in M1 as:
DEFINE MOV_Immediate32_EAX B8
or
DEFINE LOADI32_R0 B8
You only need to remember to follow that mnemonic with a 32bit immediate (%4 works).

**** Easy example
For some immediate instructions the size and placement of the immediate is obvious (or perhaps obvious once you realize the Endianness of the instruction set you are working with).
For example:
 80480b3:  a0 f3 90 04 08     mov 0x80490f3,%al
Knowing that x86 is little endian, the 08 should pop out at you.
f3 90 04 08 is the little endian encoding of the number 0x080490F3, and thus we know that the opcode is A0, that it requires a 32bit value (an absolute address) and that it writes the result to al (which is the bottom 8bits of the eax register).
Thus we can express this opcode as:
DEFINE MOV_Absolute32_al A0
or
DEFINE LOAD8_R0_Absolute32 A0
Which then always has to be followed by a 32bit absolute address ($foo works).

**** Annoying example
For some instructions, you may have to look up the opcode to determine its length and thus the length of its immediate, such as:
 80480b1:  74 34              je 80480e7 <Done>
When confronted with such a case, simply look up the 74 in http://ref.x86asm.net/coder32.html, which shows that it is both jz and je and that it takes an 8bit relative address (the 34).
Thus we can define our newly determined opcode in M1 as:
DEFINE JE8 74
DEFINE JZ8 74
or
DEFINE Jump_if_Zero8 74
but we need to make sure that whenever we use our mnemonic we follow it with an 8bit relative value (!label works well).

*** Things objdump gets wrong
The thing all disassemblers tend to get wrong, and which depends entirely on heuristics, is the identification of strings and byte constants.
In our case, it has identified our table as a set of instructions (and also correctly determined their representation):
08048074 <output>:
 8048074:  30 31              xor %dh,(%ecx)
 8048076:  32 33              xor (%ebx),%dh
 8048078:  34 35              xor $0x35,%al
 804807a:  36 37              ss aaa
 804807c:  38 39              cmp %bh,(%ecx)
 804807e:  41                 inc %ecx
 804807f:  42                 inc %edx
 8048080:  43                 inc %ebx
 8048081:  44                 inc %esp
 8048082:  45                 inc %ebp
 8048083:  46                 inc %esi
 8048084:  0a                 .byte 0xa

In M1 we have the ability to do things like strings to store such a table.
Which would probably be the following:
:output
"0123456789ABCDEF"

Which is certainly a lot easier to read and understand than
output:
	.byte 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, 0x39, 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x0A

* Using that opcode information to write a M1 program
Thus we would come to a definition list that looks something like this:
DEFINE MOVZBL_al_To_eax 0FB6C0
DEFINE JE8 74
DEFINE ADD_Immediate32_To_ebp 81C5
DEFINE ADD_Immediate32_To_esi 81C6
DEFINE ANDI8_ebp 83E5
DEFINE TEST_eax_eax 85C0
DEFINE MOV_eax_To_ecx 89C1
DEFINE MOV_eax_To_ebp 89C5
DEFINE MOV_eax_To_esi 89C6
DEFINE MOV_ebp_To_eax 89E8
DEFINE MOV_esi_To_eax 89F0
DEFINE LOAD8_al A0
DEFINE LOADI32_eax B8
DEFINE LOADI32_ecx B9
DEFINE LOADI32_edx BA
DEFINE LOADI32_ebx BB
DEFINE SHIFT_RIGHT_Immediate8_esi C1EE
DEFINE RETURN C3
DEFINE INT_80 CD80
DEFINE CALLI32 E8
DEFINE JUMP8 EB

** emacs tips
Using the objdump output, first clear the labels and non-instruction data.
Then leverage C-x ( and C-x ) to define a keyboard macro that deletes the address from the start of the line.
Use C-x e followed by e repeatedly to clear all of the lines.
M-x sort-lines will sort all selected lines (very useful as now all instructions with the same opcode are next to each other for easy pruning).
M-x delete-duplicate-lines will purge all exact duplicates (very handy for compiler output).

Then all that remains is determining the immediates and figuring out what each line actually does.
This is left as an exercise for the reader.

mescc-tools-Release_0.7.0/HACKING
-*-mode:org-*-
## Copyright (C) 2017 Jeremiah Orians
## This file is part of mescc-tools.
##
## mescc-tools is free software: you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation, either version 3 of the License, or
## (at your option) any later version.
##
## mescc-tools is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with mescc-tools. If not, see .

mescc-tools is based on the goal of absolute minimal platform bootstrap.
If your hardware doesn't have some crazy engineering decisions, mescc-tools can and likely will be trivially ported to it.

* SETUP
The most obvious way to set up for mescc-tools development is to install any C compiler and make clone of your choice.

* BUILD
The standard C based approach to building mescc-tools is simply running:
make

Should you wish to verify that mescc-tools was built correctly, run:
make test

* ADDING an ARCHITECTURE
The process is simple:
1) add an architecture flag for your architecture to hex2_linker.c
2) Then make sure byte and then bit order are correct
3) Tweak to make sure immediate prefixes are the correct size for your architecture
4) Then make sure relative displacements are calculated correctly
5) Then make sure absolute displacements are calculated correctly
6) add an architecture flag for your architecture to M1-Macro.c
7) Then make sure byte and then bit order are correct
8) Tweak to make sure immediate prefixes are the correct size for your architecture
9) If you require unusual string padding, please add that now
10) Write your architecture.def or architecture.M1 file to include instruction and register encodings that map to the required encoding.

* ROADMAP
The current outstanding work for mescc-tools is several architecture specific bootstrap ports, that unfortunately share C level code but require significant manual labor to implement.

* DEBUG
The default build process will generate debuggable binaries.
However, the output of build.sh will not be debuggable due to bootstrapping constraints.
* Bugs
mescc-tools is the most unforgiving assembly development environment possible.
Things such as manual padding requirements, arbitrary instruction encoding and other features of these tools make for rapid bootstrapping but horrific development environments.

Please only use these tools to bootstrap your system from zero; otherwise cross-compile with gcc and save yourself the pain.

mescc-tools-Release_0.7.0/INSTALL
## Copyright (C) 2017 Jeremiah Orians
## This file is part of mescc-tools.
##
## mescc-tools is free software: you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation, either version 3 of the License, or
## (at your option) any later version.
##
## mescc-tools is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with mescc-tools. If not, see .

Building and Installing mescc-tools

* Get it
git clone https://github.com/oriansj/mescc-tools.git

* Prerequisites
** Bootstrapping
mescc-tools can be bootstrapped from a simple hex assembler.

Once hex0 (in test1) is built, you'll be able to build hex1.
exec_enable (test6) will also need to be built on systems that require an execute bit to be set on binaries before they can be run.

Once hex1 is built, hex2-linker can be built with it.
Once hex2-linker is built, M1-Macro can be built with it.
Then everything else can be built with M1-Macro and hex2

** Development
The tools required for easier development include binutils, gcc and make

* Build it
make
or
./build.sh

* Check it
make test
or
./check.sh

* Install it
make install
or
./install.sh

mescc-tools-Release_0.7.0/M1-macro.c
/* -*- c-file-style: "linux";indent-tabs-mode:t -*- */
/* Copyright (C) 2016 Jeremiah Orians
 * Copyright (C) 2017 Jan Nieuwenhuizen
 * This file is part of mescc-tools.
 *
 * mescc-tools is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * mescc-tools is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with mescc-tools. If not, see .
 */

#include
#include
#include
#include

//CONSTANT max_string 4096
#define max_string 4096
//CONSTANT MACRO 1
#define MACRO 1
//CONSTANT STR 2
#define STR 2
//CONSTANT NEWLINE 3
#define NEWLINE 3

//CONSTANT TRUE 1
#define TRUE 1
//CONSTANT FALSE 0
#define FALSE 0

// CONSTANT KNIGHT 0
#define KNIGHT 0
// CONSTANT X86 1
#define X86 1
// CONSTANT AMD64 2
#define AMD64 2
// CONSTANT ARMV7L 40
#define ARMV7L 40
// CONSTANT AARM64 80
#define AARM64 80

/* Imported functions */
char* numerate_number(int a);
int hex2char(int c);
int in_set(int c, char* s);
int match(char* a, char* b);
int numerate_string(char *a);
int string_length(char* a);
void file_print(char* s, FILE* f);
void require(int bool, char* error);

/* Globals */
FILE* source_file;
FILE* destination_file;
int BigEndian;
int BigBitEndian;
int ByteMode;
int Architecture;
int linenumber;

void line_error(char* filename, int linenumber)
{
	file_print(filename, stderr);
	file_print(":", stderr);
	file_print(numerate_number(linenumber), stderr);
	file_print(" :", stderr);
}

struct Token
{
	struct Token* next;
	int type;
	char* Text;
	char* Expression;
	char* filename;
	int linenumber;
};

struct Token* newToken(char* filename, int linenumber)
{
	struct Token* p;

	p = calloc (1, sizeof (struct Token));
	require(NULL != p, "Exhusted available memory\n");

	p->filename = filename;
	p->linenumber = linenumber;

	return p;
}

struct Token* reverse_list(struct Token* head)
{
	struct Token* root = NULL;
	while(NULL != head)
	{
		struct Token* next = head->next;
		head->next = root;
		root = head;
		head = next;
	}
	return root;
}

void purge_lineComment()
{
	int c = fgetc(source_file);
	while(!in_set(c, "\n\r"))
	{
		if(EOF == c) break;
		c = fgetc(source_file);
	}
}

struct Token* append_newline(struct Token* head, char* filename)
{
	linenumber = linenumber + 1;
	if(NULL == head) return NULL;
	if(NEWLINE == head->type)
	{/* Don't waste whitespace*/
		return head;
	}

	struct Token* lf = newToken(filename, linenumber);
	lf->type = NEWLINE;
	lf->next = head;
	lf->Text = "\n";
lf->Expression = lf->Text; return lf; } struct Token* store_atom(struct Token* head, char c, char* filename) { char* store = calloc(max_string + 1, sizeof(char)); require(NULL != store, "Exhusted available memory\n"); int ch = c; int i = 0; do { store[i] = ch; ch = fgetc(source_file); i = i + 1; require(i < max_string, "storing atom of size larger than max_string\n"); if(EOF == ch) break; } while (!in_set(ch, "\t\n ")); head->Text = store; if('\n' == ch) { return append_newline(head, filename); } return head; } char* store_string(char c, char* filename) { char* store = calloc(max_string + 1, sizeof(char)); require(NULL != store, "Exhusted available memory\n"); int ch = c; int i = 0; do { store[i] = ch; i = i + 1; if('\n' == ch) linenumber = linenumber + 1; ch = fgetc(source_file); require(EOF != ch, "Unmatched \"!\n"); if(max_string == i) { line_error(filename, linenumber); file_print("String: ", stderr); file_print(store, stderr); file_print(" exceeds max string size\n", stderr); exit(EXIT_FAILURE); } } while(ch != c); return store; } struct Token* Tokenize_Line(struct Token* head, char* filename) { int c; struct Token* p; linenumber = 1; do { restart: c = fgetc(source_file); if(in_set(c, ";#")) { purge_lineComment(); head = append_newline(head, filename); goto restart; } if(in_set(c, "\t ")) { goto restart; } if('\n' == c) { head = append_newline(head, filename); goto restart; } if(EOF == c) { head = append_newline(head, filename); goto done; } p = newToken(filename, linenumber); p->next = head; if(in_set(c, "'\"")) { p->Text = store_string(c, filename); p->type = STR; } else { p = store_atom(p, c, filename); } head = p; } while(TRUE); done: return head; } void setExpression(struct Token* p, char *c, char *Exp) { struct Token* i; for(i = p; NULL != i; i = i->next) { /* Leave macros alone */ if(MACRO == i->type) { if(match(i->Text, c)) { line_error(i->filename, i->linenumber); file_print("Multiple definitions for macro ", stderr); file_print(c, stderr); 
file_print("\n", stderr); exit(EXIT_FAILURE); } continue; } else if(match(i->Text, c)) { /* Only if there is an exact match replace */ i->Expression = Exp; } } } void identify_macros(struct Token* p) { struct Token* i; for(i = p; NULL != i; i = i->next) { if(match(i->Text, "DEFINE")) { i->type = MACRO; require(NULL != i->next, "Macro name must exist\n"); i->Text = i->next->Text; require(NULL != i->next->next, "Macro value must exist\n"); if(STR == i->next->next->type) { i->Expression = i->next->next->Text + 1; } else { i->Expression = i->next->next->Text; } i->next = i->next->next->next; } } } void line_macro(struct Token* p) { struct Token* i; for(i = p; NULL != i; i = i->next) { if(MACRO == i->type) { setExpression(i->next, i->Text, i->Expression); } } } void hexify_string(struct Token* p) { char* table = "0123456789ABCDEF"; int i = string_length(p->Text); char* d = calloc(((((i >> 2) + 1) << 3) + 1), sizeof(char)); require(NULL != d, "Exhusted available memory\n"); p->Expression = d; char* S = p->Text; if(KNIGHT == Architecture) { i = ((((i - 1) >> 2) + 1) << 3); while( 0 < i) { i = i - 1; d[i] = '0'; } } while( 0 != S[0]) { S = S + 1; d[0] = table[S[0] >> 4]; d[1] = table[S[0] & 0xF]; d = d + 2; } } void process_string(struct Token* p) { struct Token* i; for(i = p; NULL != i; i = i->next) { if(STR == i->type) { if('\'' == i->Text[0]) { i->Expression = i->Text + 1; } else if('"' == i->Text[0]) { hexify_string(i); } } } } char* pad_nulls(int size, char* nil) { if(0 == size) return nil; require(size > 0, "negative null padding not possible\n"); size = size * 2; char* s = calloc(size + 1, sizeof(char)); require(NULL != s, "Exhusted available memory\n"); int i = 0; while(i < size) { s[i] = '0'; i = i + 1; } return s; } void preserve_other(struct Token* p) { struct Token* i; for(i = p; NULL != i; i = i->next) { if((NULL == i->Expression) && !(i->type & MACRO)) { char c = i->Text[0]; if(in_set(c, "!@$~%&:^")) { i->Expression = i->Text; } else if('<' == c) { 
i->Expression = pad_nulls(numerate_string(i->Text + 1), i->Text); } else { line_error(i->filename, i->linenumber); file_print("Received invalid other; ", stderr); file_print(i->Text, stderr); file_print("\n", stderr); exit(EXIT_FAILURE); } } } } void bound_values(int displacement, int number_of_bytes, int low, int high) { if((high < displacement) || (displacement < low)) { file_print("A displacement of ", stderr); file_print(numerate_number(displacement), stderr); file_print(" does not fit in ", stderr); file_print(numerate_number(number_of_bytes), stderr); file_print(" bytes\n", stderr); exit(EXIT_FAILURE); } } void range_check(int displacement, int number_of_bytes) { if(4 == number_of_bytes) return; else if(3 == number_of_bytes) { bound_values(displacement, number_of_bytes, -8388608, 16777216); return; } else if(2 == number_of_bytes) { bound_values(displacement, number_of_bytes, -32768, 65535); return; } else if(1 == number_of_bytes) { bound_values(displacement, number_of_bytes, -128, 255); return; } file_print("Received an invalid number of bytes in range_check\n", stderr); exit(EXIT_FAILURE); } void reverseBitOrder(char* c) { if(NULL == c) return; if(0 == c[1]) return; int hold = c[0]; if(16 == ByteMode) { c[0] = c[1]; c[1] = hold; reverseBitOrder(c+2); } else if(8 == ByteMode) { c[0] = c[2]; c[2] = hold; reverseBitOrder(c+3); } else if(2 == ByteMode) { c[0] = c[7]; c[7] = hold; hold = c[1]; c[1] = c[6]; c[6] = hold; hold = c[2]; c[2] = c[5]; c[5] = hold; hold = c[3]; c[3] = c[4]; c[4] = hold; reverseBitOrder(c+8); } } void LittleEndian(char* start) { char* end = start; char* c = start; while(0 != end[0]) end = end + 1; int hold; for(end = end - 1; start < end; start = start + 1) { hold = start[0]; start[0] = end[0]; end[0] = hold; end = end - 1; } if(BigBitEndian) reverseBitOrder(c); } int stringify(char* s, int digits, int divisor, int value, int shift) { int i = value; if(digits > 1) { i = stringify(s+1, (digits - 1), divisor, value, shift); } s[0] = 
hex2char(i & (divisor - 1)); return (i >> shift); } char* express_number(int value, char c) { char* ch = calloc(42, sizeof(char)); require(NULL != ch, "Exhusted available memory\n"); int size; int number_of_bytes; int shift; if('!' == c) { number_of_bytes = 1; value = value & 0xFF; } else if('@' == c) { number_of_bytes = 2; value = value & 0xFFFF; } else if('~' == c) { number_of_bytes = 3; value = value & 0xFFFFFF; } else if('%' == c) { number_of_bytes = 4; value = value & 0xFFFFFFFF; } else { file_print("Given symbol ", stderr); fputc(c, stderr); file_print(" to express immediate value ", stderr); file_print(numerate_number(value), stderr); fputc('\n', stderr); exit(EXIT_FAILURE); } range_check(value, number_of_bytes); if(16 == ByteMode) { size = number_of_bytes * 2; shift = 4; } else if(8 == ByteMode) { size = number_of_bytes * 3; shift = 3; } else if(2 == ByteMode) { size = number_of_bytes * 8; shift = 1; } else { file_print("Got invalid ByteMode in express_number\n", stderr); exit(EXIT_FAILURE); } stringify(ch, size, ByteMode, value, shift); if(!BigEndian) LittleEndian(ch); else if(!BigBitEndian) reverseBitOrder(ch); return ch; } void eval_immediates(struct Token* p) { struct Token* i; for(i = p; NULL != i; i = i->next) { if(MACRO == i->type) continue; else if(NEWLINE == i->type) continue; else if('<' == i->Text[0]) continue; else if(NULL == i->Expression) { int value; if((X86 == Architecture) || (AMD64 == Architecture) || (ARMV7L == Architecture) || (AARM64 == Architecture)) { value = numerate_string(i->Text + 1); if(('0' == i->Text[1]) || (0 != value)) { i->Expression = express_number(value, i->Text[0]); } } else if(KNIGHT == Architecture) { value = numerate_string(i->Text); if(('0' == i->Text[0]) || (0 != value)) { i->Expression = express_number(value, '@'); } } else { file_print("Unknown architecture received in eval_immediates\n", stderr); exit(EXIT_FAILURE); } } } } void print_hex(struct Token* p) { struct Token* i; for(i = p; NULL != i; i = i->next) { 
if(NEWLINE == i->type) { if(NULL == i->next) fputc('\n', destination_file); else if((NEWLINE != i->next->type) && (MACRO != i->next->type)) fputc('\n', destination_file); } else if(i->type != MACRO) { file_print(i->Expression, destination_file); if(NEWLINE != i->next->type) fputc(' ', destination_file); } } } /* Standard C main program */ int main(int argc, char **argv) { BigEndian = TRUE; struct Token* head = NULL; Architecture = KNIGHT; destination_file = stdout; BigBitEndian = TRUE; ByteMode = 16; char* filename; char* arch; int option_index = 1; while(option_index <= argc) { if(NULL == argv[option_index]) { option_index = option_index + 1; } else if(match(argv[option_index], "--BigEndian")) { BigEndian = TRUE; option_index = option_index + 1; } else if(match(argv[option_index], "--LittleEndian")) { BigEndian = FALSE; option_index = option_index + 1; } else if(match(argv[option_index], "-A") || match(argv[option_index], "--architecture")) { arch = argv[option_index + 1]; if(match("knight-native", arch) || match("knight-posix", arch)) Architecture = KNIGHT; else if(match("x86", arch)) Architecture = X86; else if(match("amd64", arch)) Architecture = AMD64; else if(match("armv7l", arch)) Architecture = ARMV7L; else if(match("aarch64", arch)) Architecture = AARM64; else { file_print("Unknown architecture: ", stderr); file_print(arch, stderr); file_print(" know values are: knight-native, knight-posix, x86, amd64, armv7l and aarch64", stderr); exit(EXIT_FAILURE); } option_index = option_index + 2; } else if(match(argv[option_index], "-b") || match(argv[option_index], "--binary")) { ByteMode = 2; option_index = option_index + 1; } else if(match(argv[option_index], "-h") || match(argv[option_index], "--help")) { file_print("Usage: ", stderr); file_print(argv[0], stderr); file_print(" -f FILENAME1 {-f FILENAME2} (--BigEndian|--LittleEndian) ", stderr); file_print("[--architecture name]\nArchitectures: knight-native, knight-posix, x86, amd64 and armv7\n", stderr); 
file_print("To leverage octal or binary output: --octal, --binary\n", stderr); exit(EXIT_SUCCESS); } else if(match(argv[option_index], "-f") || match(argv[option_index], "--file")) { filename = argv[option_index + 1]; source_file = fopen(filename, "r"); if(NULL == source_file) { file_print("The file: ", stderr); file_print(argv[option_index + 1], stderr); file_print(" can not be opened!\n", stderr); exit(EXIT_FAILURE); } head = Tokenize_Line(head, filename); option_index = option_index + 2; } else if(match(argv[option_index], "-o") || match(argv[option_index], "--output")) { destination_file = fopen(argv[option_index + 1], "w"); if(NULL == destination_file) { file_print("The file: ", stderr); file_print(argv[option_index + 1], stderr); file_print(" can not be opened!\n", stderr); exit(EXIT_FAILURE); } option_index = option_index + 2; } else if(match(argv[option_index], "-O") || match(argv[option_index], "--octal")) { ByteMode = 8; option_index = option_index + 1; } else if(match(argv[option_index], "-V") || match(argv[option_index], "--version")) { file_print("M1 0.7.0\n", stdout); exit(EXIT_SUCCESS); } else { file_print("Unknown option\n", stderr); exit(EXIT_FAILURE); } } if(NULL == head) { file_print("Either no input files were given or they were empty\n", stderr); exit(EXIT_FAILURE); } head = reverse_list(head); identify_macros(head); line_macro(head); process_string(head); eval_immediates(head); preserve_other(head); print_hex(head); return EXIT_SUCCESS; } mescc-tools-Release_0.7.0/README.md000066400000000000000000000036251361470113000166670ustar00rootroot00000000000000## Copyright (C) 2017 Jeremiah Orians ## This file is part of mescc-tools. ## ## mescc-tools is free software: you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation, either version 3 of the License, or ## (at your option) any later version. 
##
## mescc-tools is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with mescc-tools. If not, see <http://www.gnu.org/licenses/>.

The master repository for this work is located at:
https://savannah.nongnu.org/projects/mescc-tools

# If you wish to contribute:
pull requests can be made at https://github.com/oriansj/mescc-tools and https://gitlab.com/janneke/mescc-tools,
or patches/diffs can be sent via email to Jeremiah (at) pdp10 [dot] guru,
or join us on freenode's #bootstrappable

This is a collection of tools written for use in bootstrapping.

# blood-elf
A tool for generating ELF debug tables in M1-macro format from M1-macro assembly files.

# exec_enable
A tool for marking files as executable, for systems that don't have chmod.

# get_machine
A tool for identifying what hardware architecture you are running on.

# kaem
A minimal build tool for running shell scripts on systems that lack any shell.

# hex2_linker
The trivially bootstrappable linker, designed to be introspectable by humans and, should you so desire, to assemble hex programs that you write.

# M1-macro
The universal macro assembler that can target any reasonable hardware architecture.

With these tools on your system, you can always bootstrap real programs, even in the most catastrophic of situations, provided you keep your cool.
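
As a rough illustration of what M1-macro does with a definition such as `DEFINE INT_80 CD80`, here is a minimal sketch in C. It is not the M1-macro source: `struct macro` and `expand_token` are hypothetical names made up for this example, the `NOP`/`90` pair in the usage note is only illustrative, and unlike the real tool the sketch passes unknown tokens through unchanged instead of evaluating immediates or reporting errors.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record for this sketch: one DEFINE line (name and expansion). */
struct macro
{
	const char* name;
	const char* expansion;
};

/* Return the expansion of token when a macro with exactly that name exists;
 * otherwise return the token itself unchanged (a simplification). */
const char* expand_token(const struct macro* table, size_t count, const char* token)
{
	size_t i;
	for(i = 0; i < count; i = i + 1)
	{
		if(0 == strcmp(table[i].name, token)) return table[i].expansion;
	}
	return token;
}
```

With a table holding `{"INT_80", "CD80"}` and `{"NOP", "90"}`, `expand_token(table, 2, "INT_80")` yields `"CD80"`, while a token with no matching definition is handed back untouched.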
mescc-tools-Release_0.7.0/blood-elf.c000066400000000000000000000137711361470113000174220ustar00rootroot00000000000000/* -*- c-file-style: "linux";indent-tabs-mode:t -*- */ /* Copyright (C) 2017 Jeremiah Orians * Copyright (C) 2017 Jan Nieuwenhuizen * This file is part of mescc-tools * * mescc-tools is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * mescc-tools is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with mescc-tools. If not, see . */ #include #include #include #include #include #include #define max_string 4096 //CONSTANT max_string 4096 #define TRUE 1 //CONSTANT TRUE 1 #define FALSE 0 //CONSTANT FALSE 0 int BITSIZE; int in_set(int c, char* s); int match(char* a, char* b); void file_print(char* s, FILE* f); void require(int bool, char* error); struct entry { struct entry* next; char* name; }; FILE* output; struct entry* jump_table; void consume_token(FILE* source_file, char* s) { int i = 0; int c = fgetc(source_file); require(EOF != c, "Can not have an EOF token\n"); do { s[i] = c; i = i + 1; require(max_string > i, "Token exceeds token length restriction\n"); c = fgetc(source_file); if(EOF == c) break; } while(!in_set(c, " \t\n>")); } void storeLabel(FILE* source_file) { struct entry* entry = calloc(1, sizeof(struct entry)); /* Prepend to list */ entry->next = jump_table; jump_table = entry; /* Store string */ entry->name = calloc((max_string + 1), sizeof(char)); consume_token(source_file, entry->name); /* Remove all entries that start with the forbidden char pattern :_ */ if('_' == entry->name[0]) { jump_table = 
jump_table->next; } } void line_Comment(FILE* source_file) { int c = fgetc(source_file); while(!in_set(c, "\n\r")) { if(EOF == c) break; c = fgetc(source_file); } } void purge_string(FILE* source_file) { int c = fgetc(source_file); while((EOF != c) && ('"' != c)) { c = fgetc(source_file); } } void first_pass(struct entry* input) { if(NULL == input) return; first_pass(input->next); FILE* source_file = fopen(input->name, "r"); if(NULL == source_file) { file_print("The file: ", stderr); file_print(input->name, stderr); file_print(" can not be opened!\n", stderr); exit(EXIT_FAILURE); } int c; for(c = fgetc(source_file); EOF != c; c = fgetc(source_file)) { /* Check for and deal with label */ if(58 == c) { storeLabel(source_file); } /* Check for and deal with line comments */ else if (c == '#' || c == ';') { line_Comment(source_file); } else if ('"' == c) { purge_string(source_file); } } fclose(source_file); } void output_debug(struct entry* node, int stage) { struct entry* i; for(i = node; NULL != i; i = i->next) { if(stage) { file_print(":ELF_str_", output); file_print(i->name, output); file_print("\n\"", output); file_print(i->name, output); file_print("\"\n", output); } else if(64 == BITSIZE) { file_print("%ELF_str_", output); file_print(i->name, output); file_print(">ELF_str\n!2\n!0\n@1\n&", output); file_print(i->name, output); file_print(" %0\n%10000\n%0\n", output); } else { file_print("%ELF_str_", output); file_print(i->name, output); file_print(">ELF_str\n&", output); file_print(i->name, output); file_print("\n%10000\n!2\n!0\n@1\n", output); } } } struct entry* reverse_list(struct entry* head) { struct entry* root = NULL; struct entry* next; while(NULL != head) { next = head->next; head->next = root; root = head; head = next; } return root; } /* Standard C main program */ int main(int argc, char **argv) { jump_table = NULL; struct entry* input = NULL; output = stdout; char* output_file = ""; BITSIZE = 32; int option_index = 1; while(option_index <= argc) { 
		if(NULL == argv[option_index])
		{
			option_index = option_index + 1;
		}
		else if(match(argv[option_index], "-h") || match(argv[option_index], "--help"))
		{
			file_print("Usage: ", stderr);
			file_print(argv[0], stderr);
			file_print(" --file FILENAME1 {--file FILENAME2} --output FILENAME\n", stderr);
			exit(EXIT_SUCCESS);
		}
		else if(match(argv[option_index], "--64"))
		{
			BITSIZE = 64;
			option_index = option_index + 1;
		}
		else if(match(argv[option_index], "-f") || match(argv[option_index], "--file"))
		{
			/* Prepend the input file to the list; first_pass walks it in reverse */
			struct entry* temp = calloc(1, sizeof(struct entry));
			temp->name = argv[option_index + 1];
			temp->next = input;
			input = temp;
			option_index = option_index + 2;
		}
		else if(match(argv[option_index], "-o") || match(argv[option_index], "--output"))
		{
			output_file = argv[option_index + 1];
			output = fopen(output_file, "w");

			if(NULL == output)
			{
				/* Report the output file that failed to open; input may still be NULL here */
				file_print("The file: ", stderr);
				file_print(output_file, stderr);
				file_print(" can not be opened!\n", stderr);
				exit(EXIT_FAILURE);
			}
			option_index = option_index + 2;
		}
		else if(match(argv[option_index], "-V") || match(argv[option_index], "--version"))
		{
			file_print("blood-elf 0.7.0\n(Basically Launches Odd Object Dump ExecutabLe Files)\n", stdout);
			exit(EXIT_SUCCESS);
		}
		else
		{
			file_print("Unknown option\n", stderr);
			exit(EXIT_FAILURE);
		}
	}

	/* Make sure we have a program tape to run */
	if (NULL == input)
	{
		return EXIT_FAILURE;
	}

	/* Get all of the labels */
	first_pass(input);

	/* Reverse their order */
	jump_table = reverse_list(jump_table);

	/* Emit the string table, then the symbol table that points into it */
	file_print(":ELF_str\n!0\n", output);
	output_debug(jump_table, TRUE);
	if(64 == BITSIZE) file_print("%0\n:ELF_sym\n%0\n!0\n!0\n@1\n%0 %0\n%0 %0\n", output);
	else file_print("%0\n:ELF_sym\n%0\n%0\n%0\n!0\n!0\n@1\n", output);
	output_debug(jump_table, FALSE);
	file_print("\n:ELF_end\n", output);

	return EXIT_SUCCESS;
}
mescc-tools-Release_0.7.0/build.sh000077500000000000000000000162111361470113000170410ustar00rootroot00000000000000
#!
/bin/sh # Copyright © 2017 Jan Nieuwenhuizen # Copyright © 2017 Jeremiah Orians # # This file is part of mescc-tools. # # mescc-tools is free software; you can redistribute it and/or modify it # under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 3 of the License, or (at # your option) any later version. # # mescc-tools is distributed in the hope that it will be useful, but # WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with mescc-tools. If not, see . set -eux M2=${M2-../M2-Planet} MESCC_TOOLS_SEED=${MESCC_TOOLS_SEED-../mescc-tools-seed} ######################################### # Phase-0 Build from external binaries # # To be replaced by a trusted path # ######################################### # Make sure we have our required output directory [ -e bin ] || mkdir -p bin # blood-elf $M2/bin/M2-Planet \ -f $M2/functions/exit.c \ -f $M2/functions/file.c \ -f functions/file_print.c \ -f $M2/functions/malloc.c \ -f $M2/functions/calloc.c \ -f functions/match.c \ -f blood-elf.c \ --debug \ -o blood-elf.M1 || exit 1 # Build debug footer $MESCC_TOOLS_SEED/blood-elf-0 \ -f blood-elf.M1 \ -o blood-elf-footer.M1 || exit 2 # Macro assemble with libc written in M1-Macro $MESCC_TOOLS_SEED/M1-0 \ -f $M2/test/common_x86/x86_defs.M1 \ -f $M2/functions/libc-core.M1 \ -f blood-elf.M1 \ -f blood-elf-footer.M1 \ --LittleEndian \ --architecture x86 \ -o blood-elf.hex2 || exit 3 # Resolve all linkages $MESCC_TOOLS_SEED/hex2-0 \ -f elf_headers/elf32-debug.hex2 \ -f blood-elf.hex2 \ --LittleEndian \ --architecture x86 \ --BaseAddress 0x8048000 \ -o bin/blood-elf-0 \ --exec_enable || exit 4 # Build # M1-macro phase $M2/bin/M2-Planet \ -f $M2/functions/exit.c \ -f $M2/functions/file.c \ -f functions/file_print.c \ -f 
$M2/functions/malloc.c \ -f $M2/functions/calloc.c \ -f functions/match.c \ -f functions/numerate_number.c \ -f functions/string.c \ -f functions/in_set.c \ -f M1-macro.c \ --debug \ -o M1-macro.M1 || exit 5 # Build debug footer $MESCC_TOOLS_SEED/blood-elf-0 \ -f M1-macro.M1 \ -o M1-macro-footer.M1 || exit 6 # Macro assemble with libc written in M1-Macro $MESCC_TOOLS_SEED/M1-0 \ -f $M2/test/common_x86/x86_defs.M1 \ -f $M2/functions/libc-core.M1 \ -f M1-macro.M1 \ -f M1-macro-footer.M1 \ --LittleEndian \ --architecture x86 \ -o M1-macro.hex2 || exit 7 # Resolve all linkages $MESCC_TOOLS_SEED/hex2-0 \ -f elf_headers/elf32-debug.hex2 \ -f M1-macro.hex2 \ --LittleEndian \ --architecture x86 \ --BaseAddress 0x8048000 \ -o bin/M1-0 \ --exec_enable || exit 8 # hex2 $M2/bin/M2-Planet \ -f $M2/functions/exit.c \ -f $M2/functions/file.c \ -f functions/file_print.c \ -f $M2/functions/malloc.c \ -f $M2/functions/calloc.c \ -f functions/match.c \ -f functions/numerate_number.c \ -f functions/in_set.c \ -f $M2/functions/stat.c \ -f hex2_linker.c \ --debug \ -o hex2_linker.M1 || exit 9 # Build debug footer $MESCC_TOOLS_SEED/blood-elf-0 \ -f hex2_linker.M1 \ -o hex2_linker-footer.M1 || exit 10 # Macro assemble with libc written in M1-Macro $MESCC_TOOLS_SEED/M1-0 \ -f $M2/test/common_x86/x86_defs.M1 \ -f $M2/functions/libc-core.M1 \ -f hex2_linker.M1 \ -f hex2_linker-footer.M1 \ --LittleEndian \ --architecture x86 \ -o hex2_linker.hex2|| exit 11 # Resolve all linkages $MESCC_TOOLS_SEED/hex2-0 \ -f elf_headers/elf32-debug.hex2 \ -f hex2_linker.hex2 \ --LittleEndian \ --architecture x86 \ --BaseAddress 0x8048000 \ -o bin/hex2-0 \ --exec_enable || exit 12 ######################### # Phase-1 Self-host # ######################### # blood-elf # Build debug footer ./bin/blood-elf-0 \ -f blood-elf.M1 \ -o blood-elf-footer.M1 || exit 13 # Macro assemble with libc written in M1-Macro ./bin/M1-0 \ -f $M2/test/common_x86/x86_defs.M1 \ -f $M2/functions/libc-core.M1 \ -f blood-elf.M1 \ -f 
blood-elf-footer.M1 \ --LittleEndian \ --architecture x86 \ -o blood-elf.hex2 || exit 14 # Resolve all linkages ./bin/hex2-0 \ -f elf_headers/elf32-debug.hex2 \ -f blood-elf.hex2 \ --LittleEndian \ --architecture x86 \ --BaseAddress 0x8048000 \ -o bin/blood-elf \ --exec_enable || exit 15 # M1-macro # Build debug footer ./bin/blood-elf \ -f M1-macro.M1 \ -o M1-macro-footer.M1 || exit 16 # Macro assemble with libc written in M1-Macro ./bin/M1-0 \ -f $M2/test/common_x86/x86_defs.M1 \ -f $M2/functions/libc-core.M1 \ -f M1-macro.M1 \ -f M1-macro-footer.M1 \ --LittleEndian \ --architecture x86 \ -o M1-macro.hex2 || exit 17 # Resolve all linkages ./bin/hex2-0 \ -f elf_headers/elf32-debug.hex2 \ -f M1-macro.hex2 \ --LittleEndian \ --architecture x86 \ --BaseAddress 0x8048000 \ -o bin/M1 \ --exec_enable || exit 18 # hex2 # Build debug footer ./bin/blood-elf \ -f hex2_linker.M1 \ -o hex2_linker-footer.M1 || exit 19 # Macro assemble with libc written in M1-Macro ./bin/M1 \ -f $M2/test/common_x86/x86_defs.M1 \ -f $M2/functions/libc-core.M1 \ -f hex2_linker.M1 \ -f hex2_linker-footer.M1 \ --LittleEndian \ --architecture x86 \ -o hex2_linker.hex2|| exit 20 # Resolve all linkages ./bin/hex2-0 \ -f elf_headers/elf32-debug.hex2 \ -f hex2_linker.hex2 \ --LittleEndian \ --architecture x86 \ --BaseAddress 0x8048000 \ -o bin/hex2 \ --exec_enable || exit 21 # Clean up after ourself rm -f bin/blood-elf-0 bin/M1-0 bin/hex2-0 # Build pieces that were not needed in bootstrap # but are generally useful # get_machine $M2/bin/M2-Planet \ -f $M2/functions/exit.c \ -f $M2/functions/file.c \ -f functions/file_print.c \ -f $M2/functions/malloc.c \ -f $M2/functions/calloc.c \ -f $M2/functions/uname.c \ -f get_machine.c \ --debug \ -o get_machine.M1 || exit 22 # Build debug footer ./bin/blood-elf \ -f get_machine.M1 \ -o get_machine-footer.M1 || exit 23 # Macro assemble with libc written in M1-Macro ./bin/M1 \ -f $M2/test/common_x86/x86_defs.M1 \ -f $M2/functions/libc-core.M1 \ -f get_machine.M1 \ 
-f get_machine-footer.M1 \ --LittleEndian \ --architecture x86 \ -o get_machine.hex2 || exit 24 # Resolve all linkages ./bin/hex2 \ -f elf_headers/elf32-debug.hex2 \ -f get_machine.hex2 \ --LittleEndian \ --architecture x86 \ --BaseAddress 0x8048000 \ -o bin/get_machine \ --exec_enable || exit 25 # exec_enable $M2/bin/M2-Planet \ -f $M2/functions/file.c \ -f functions/file_print.c \ -f $M2/functions/exit.c \ -f $M2/functions/stat.c \ -f exec_enable.c \ --debug \ -o exec_enable.M1 || exit 26 # Build debug footer ./bin/blood-elf \ -f exec_enable.M1 \ -o exec_enable-footer.M1 || exit 27 # Macro assemble with libc written in M1-Macro ./bin/M1 \ -f $M2/test/common_x86/x86_defs.M1 \ -f $M2/functions/libc-core.M1 \ -f exec_enable.M1 \ -f exec_enable-footer.M1 \ --LittleEndian \ --architecture x86 \ -o exec_enable.hex2 || exit 28 # Resolve all linkages ./bin/hex2 \ -f elf_headers/elf32-debug.hex2 \ -f exec_enable.hex2 \ --LittleEndian \ --architecture x86 \ --BaseAddress 0x8048000 \ -o bin/exec_enable \ --exec_enable || exit 29 # TODO # kaem mescc-tools-Release_0.7.0/catm.c000066400000000000000000000033201361470113000164700ustar00rootroot00000000000000/* Copyright (C) 2019 Jeremiah Orians * This file is part of mescc-tools * * mescc-tools is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * mescc-tools is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with mescc-tools. If not, see . 
 */

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
void file_print(char* s, FILE* f);

// CONSTANT BUFFER_SIZE 4096
#define BUFFER_SIZE 4096

int main(int argc, char** argv)
{
	/* argv[1] is the output file; at least one input file must follow it */
	if(3 > argc)
	{
		file_print("catm requires 2 or more arguments\n", stderr);
		exit(EXIT_FAILURE);
	}

	/* On Linux 577 is O_WRONLY|O_CREAT|O_TRUNC and 384 is mode 0600 */
	int output = open(argv[1], 577 , 384);
	if(-1 == output)
	{
		file_print("The file: ", stderr);
		file_print(argv[1], stderr);
		file_print(" is not a valid output file name\n", stderr);
		exit(EXIT_FAILURE);
	}

	int i;
	int bytes;
	char* buffer = calloc(BUFFER_SIZE + 1, sizeof(char));
	int input;
	/* Stop before argv[argc], which is always NULL */
	for(i = 2; i < argc ; i = i + 1)
	{
		input = open(argv[i], 0, 0);
		if(-1 == input)
		{
			file_print("The file: ", stderr);
			file_print(argv[i], stderr);
			file_print(" is not a valid input file name\n", stderr);
			exit(EXIT_FAILURE);
		}
		/* Copy in BUFFER_SIZE chunks; a short read marks the end of the file */
keep:
		bytes = read(input, buffer, BUFFER_SIZE);
		write(output, buffer, bytes);
		if(BUFFER_SIZE == bytes) goto keep;
	}

	free(buffer);
	return EXIT_SUCCESS;
}
mescc-tools-Release_0.7.0/check.sh000077500000000000000000000025511361470113000170210ustar00rootroot00000000000000
#! /usr/bin/env bash
# Copyright © 2017 Jan Nieuwenhuizen
# Copyright © 2017 Jeremiah Orians
# Copyright (C) 2019 ng0
#
# This file is part of mescc-tools
#
# mescc-tools is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or (at
# your option) any later version.
#
# mescc-tools is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with mescc-tools. If not, see <http://www.gnu.org/licenses/>.
set -eux
[ -e bin ] || mkdir -p bin
[ -f bin/M1 ] || exit 1
[ -f bin/hex2 ] || exit 2
[ -f bin/blood-elf ] || exit 3
#[ -f bin/kaem ] || exit 4
[ -f bin/get_machine ] || exit 5
[ -f bin/exec_enable ] || exit 6
[ -e test/results ] || mkdir -p test/results
./test/test0/hello.sh
./test/test1/hello.sh
./test/test2/hello.sh
./test/test3/hello.sh
./test/test4/hello.sh
./test/test5/hello.sh
./test/test6/hello.sh
./test/test7/hello.sh
./test/test8/hello.sh
./test/test9/hello.sh
./test/test10/hello.sh
./test/test11/hello.sh
. sha256.sh
sha256_check test/test.answers
mescc-tools-Release_0.7.0/docs/000077500000000000000000000000001361470113000163325ustar00rootroot00000000000000
mescc-tools-Release_0.7.0/docs/M1.1000066400000000000000000000042041361470113000166710ustar00rootroot00000000000000
.\"Made with Love
.TH M1 1 "JULY 2019" Linux "User Manuals"
.SH NAME
M1 \- The universal Macro assembler
.SH SYNOPSIS
.na
M1 --architecture ARCHITECTURE --file FILE [--output FILE --octal --binary --BigEndian --LittleEndian]
.SH DESCRIPTION
M1 is the most minimal cross-platform macro assembler possible.
.br
It leverages a few grammar details to give the user a rich vocabulary for
expressing the desired output and input, thus eliminating surprise from the
generation process.
.br
At its core is DEFINE MACRO EXPANDED
.br
One can define an assembly instruction like: DEFINE INT_80 CD80
and know that everywhere INT_80 is placed in the input file, the hex encoding
of that instruction will appear in the output file or, if no output file is
specified, the output will be sent to standard out.
.br
The supported ARCHITECTURES are as follows: knight-native, knight-posix, x86,
amd64, armv7l and aarch64.
If you fail to specify an architecture, the default of knight-native will be used.
.br
M1 also supports the generation of 8, 16, 24 and 32bit integers provided they
are encoded as !NUM, @NUM, ~NUM or %NUM respectively.
e.g. !-1, @42 and %8675309
You cannot, however, mix integers with definitions, as macros are always
applied first and their expansions are not evaluated further.

Should you wish to specify the bit and byte encoding of the integers to match
your target, pass --BigEndian or --LittleEndian.

Should you wish for the output to be something other than hex format, you can
specify the exact output you desire with: --octal or --binary
.SH EXAMPLES
Typically, M1 will be called in scripts used in bootstrapping
.br
# M1 -f x86.defs -f libc-core.M1 -f cc.M1 -f cc-footer.M1 --LittleEndian --architecture x86 -o cc.hex2
.br
.SH COMPATIBILITY
M1 is compatible with all Turing complete machines;
even the ones that try to be Turing complete -1
.SH AUTHORS
Jeremiah Orians
.br
Jan (janneke) Nieuwenhuizen
.SH COPYRIGHT
Copyright 2016-2019 Jeremiah Orians
.br
Copyright 2017 Jan Nieuwenhuizen
.br
License GPLv3+.
.SH "SEE ALSO"
hex2(1), blood-elf(1), kaem(1), syscalls(2)
mescc-tools-Release_0.7.0/docs/blood-elf.1000066400000000000000000000025461361470113000202660ustar00rootroot00000000000000
.\"Made with Love
.TH blood-elf 1 "JULY 2019" Linux "User Manuals"
.SH NAME
blood-elf - Get mescc with dwarfs
.SH SYNOPSIS
.na
blood-elf --file FILE [--output FILE --64]
.SH DESCRIPTION
blood-elf exists to generate ELF debug tables in M1-macro format from M1-macro
assembly files.
.br
At its core, it reads until it finds a :LABEL and then adds it to the list of
things to output. It ignores labels whose names are prefixed with '_',
e.g. :_foo will not get an entry. If no output is specified, the result will be
sent to Standard Out.
.br
Fortunately, the only architecture difference that you need to concern yourself
with is whether your binary is going to be 64bits or 32bits (which is the
default); pass the flag --64 should you need that alternate format.
.SH EXAMPLES
Typically, blood-elf will be called in scripts used in bootstrapping
.br
# blood-elf -f cc.M1 -o cc-footer.M1
.br
# blood-elf --file cc.M1 --64 --output cc-footer.M1
.br
.SH COMPATIBILITY
blood-elf is compatible with all Turing complete machines;
even the ones that try to be Turing complete -1.
.SH AUTHORS
Jeremiah Orians
.br
Jan (janneke) Nieuwenhuizen
.SH COPYRIGHT
Copyright 2016-2019 Jeremiah Orians
.br
Copyright 2017 Jan Nieuwenhuizen
.br
License GPLv3+.
.SH "SEE ALSO"
M1(1), hex2(1), kaem(1), syscalls(2)
mescc-tools-Release_0.7.0/docs/get_machine.1000066400000000000000000000031111361470113000206530ustar00rootroot00000000000000
.\"Made with Love
.TH get_machine 1 "JULY 2019" Linux "User Manuals"
.SH NAME
get_machine - identify running hardware architecture
.SH SYNOPSIS
.na
get_machine [ --exact --override OVERRIDE --OS ]
.SH DESCRIPTION
get_machine exists to simplify shell scripts that need to know which hardware
architecture they are running on or which host operating system is being used.
.br
At its core, it figures out the general hardware architecture and returns it as
quickly as possible. It is sometimes useful to return something different,
which is why --override exists; whatever is supplied with it is returned.
Scripts that wish to expose that can leverage the environment variable
GET_MACHINE_FLAGS to allow efficient overriding in their scripts.
.br
If one wishes for something more exact than x86 or amd64, the option --exact
will return more specific values like i686-pae.
.br
Should one desire to know the host operating system: --OS
.br
A word of warning: --override always takes top precedence and can return
anything you desire.
.SH EXAMPLES
Typically, get_machine will be called in scripts used in bootstrapping
.br
# get_machine
.br
# get_machine --override "I am the very model of a modern major general"
.br
# get_machine ${GET_MACHINE_FLAGS}
.br
.SH COMPATIBILITY
get_machine is compatible with all Turing complete machines;
even the ones that try to be Turing complete -1
.SH AUTHORS
Jeremiah Orians
.SH COPYRIGHT
Copyright 2016-2019 Jeremiah Orians
.br
License GPLv3+.
.SH "SEE ALSO"
M1(1), hex2(1), blood-elf(1), kaem(1), syscalls(2)
mescc-tools-Release_0.7.0/docs/hex2.1000066400000000000000000000047341361470113000172720ustar00rootroot00000000000000
.\"Made with Love
.TH hex2 1 "JULY 2019" Linux "User Manuals"
.SH NAME
hex2 - The trivially bootstrappable linker that is designed to be introspectable by humans
.SH SYNOPSIS
.na
hex2 --architecture ARCHITECTURE --BaseAddress ADDRESS --file FILE [--output FILE --exec_enable]
.SH DESCRIPTION
hex2 is designed to allow humans to write elf and other binary files by hand in
a format that allows comments and ease of understanding.
.br
At its core, it reads 2 hex characters, combines them and outputs a single
byte. You can override this and use binary or octal input if you so desire,
using the --octal or --binary option.
.br
If no output file is specified, the output will be sent to standard out.
By default the file will not be executable unless the option --exec_enable is
also passed.
.br
The supported ARCHITECTURES are as follows: knight-native, knight-posix, x86,
amd64, armv7l and aarch64.
If you fail to specify an architecture, the default of knight-native will be used.
.br
The base address at which the binary is to be loaded into memory, and on which
the relative and absolute pointers should thus be based, is passed via
--BaseAddress; if it is not provided, the default value of ZERO will be assumed.
.br
hex2 also supports labels in the :LABEL format and relative and absolute
pointers to those labels in 8, 16, 24 or 32bit sizes.
!LABEL, @LABEL, ~LABEL and %LABEL are used for 8, 16, 24 and 32bit relative addresses respectively, and $LABEL and &LABEL for 16 and 32bit absolute addresses respectively. Should you wish to specify the bit and byte encoding of the addresses to match your target, use --BigEndian or --LittleEndian. On architectures that require word alignment, the < and ^ characters have a special meaning; namely, pad to word and use word-based address calculation rather than standard byte-based address calculation; generally seen in the form: ^~LABEL EB for calls in ARM
.SH EXAMPLES
Typically, hex2 will be called in scripts used in bootstrapping
.br
# hex2 -f ELF-armv7l.hex2 -f cc.hex2 --LittleEndian --architecture armv7l --BaseAddress 0x10000 -o cc --exec_enable
.br
.SH COMPATIBILITY
hex2 is compatible with all Turing complete machines; even the ones that try to be Turing complete -1
.SH AUTHORS
Jeremiah Orians
.br
Jan (janneke) Nieuwenhuizen
.SH COPYRIGHT
Copyright 2016-2019 Jeremiah Orians
.br
Copyright 2017 Jan Nieuwenhuizen
.br
License GPLv3+.
.SH "SEE ALSO"
M1(1), blood-elf(1), kaem(1), syscalls(2)
mescc-tools-Release_0.7.0/docs/kaem.1000066400000000000000000000032011361470113000173250ustar00rootroot00000000000000
.\"Made with Love
.TH kaem 1 "JULY 2019" Linux "User Manuals"
.SH NAME
kaem - Like running with scissors but shell scripts without a shell
.SH SYNOPSIS
.na
kaem [--file FILE --strict --verbose --nightmare-mode]
.SH DESCRIPTION
kaem exists to be the most minimal shell needed in the bootstrap process; it has the ability to function as an init, thus allowing the bootstrap to occur with itself and a hex0 assembler as the only binaries running on the system.
.br
At its core, it reads a line (when the line is terminated with a '\\', it reads the next line too), collects the arguments into an array, looks up the program at the path provided by the name or, if not found there, in the ENVIRONMENT, and then executes the program with the specified options.
.br
If no filename is passed, it attempts to execute a file called kaem.run, which is the standard name for kaem scripts. If you wish for kaem to stop when one of the lines throws or returns an error code, simply add --strict. If you wish to see what is being executed, simply add --verbose. If you hate dealing with an environment and want to eliminate it entirely, add --nightmare-mode.
.SH EXAMPLES
Typically, kaem will be running scripts used in bootstrapping
.br
# kaem --verbose --strict
.br
# kaem -f bootstrap.sh --nightmare-mode
.br
.SH COMPATIBILITY
kaem is compatible with all Turing complete machines; even the ones that try to be Turing complete -1
.SH AUTHORS
Jeremiah Orians
.SH COPYRIGHT
Copyright 2016-2019 Jeremiah Orians
.br
License GPLv3+.
.SH "SEE ALSO"
M1(1), hex2(1), blood-elf(1), get_machine(1), syscalls(2)
mescc-tools-Release_0.7.0/elf_headers/000077500000000000000000000000001361470113000176435ustar00rootroot00000000000000
mescc-tools-Release_0.7.0/elf_headers/elf32-ARM-debug.hex2000066400000000000000000000164751361470113000230640ustar00rootroot00000000000000
### Copyright (C) 2016 Jeremiah Orians
### Copyright (C) 2017 Jan Nieuwenhuizen
### This file is part of stage0.
###
### stage0 is free software: you can redistribute it and/or modify
### it under the terms of the GNU General Public License as published by
### the Free Software Foundation, either version 3 of the License, or
### (at your option) any later version.
###
### stage0 is distributed in the hope that it will be useful,
### but WITHOUT ANY WARRANTY; without even the implied warranty of
### MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
### GNU General Public License for more details.
###
### You should have received a copy of the GNU General Public License
### along with stage0. If not, see .
### stage0's hex2 format
### !