Wording
0xZ0F committed Jan 26, 2020
1 parent 83380cc commit 1367801
Showing 9 changed files with 44 additions and 17 deletions.
2 changes: 1 addition & 1 deletion Chapter 2 - BinaryBasics/2.1 NumberSystems.md
@@ -34,7 +34,7 @@ Hexadecimal is very similar but can be a little confusing for some people. You s

A = 10, B = 11, ..., F = 15

Hexadecimal numbers are usually given a "0x" prefix or a "h" suffix such as 0xFF or FFh.
Hexadecimal numbers are usually given a "0x" prefix or the suffix "h" such as 0xFF or FFh.

0x4A = (16<sup>1</sup> * 4d) + (16<sup>0</sup> * 10d) = 64d + 10d = 74d.

23 changes: 22 additions & 1 deletion Chapter 2 - BinaryBasics/2.6 Mindset.md
@@ -1,5 +1,26 @@
# 2.6 Mindset
Having the right mindset can be really helpful. Something to understand is that computers are extremely stupid. They operate on "blunt logic" and they don't make any assumptions (note that this only holds true as long as Skynet doesn't become a reality). You can think of computers as trains, they don't stop and only go in a very specific and direct path as designated by the tracks. If there's a child on the tracks it's up to the people controlling the track to divert the train. This is why Windows gives you the Blue Screen of Death (BSOD) when there is a kernel error. If the OS doesn't stop that error, catastrophic damage could occur.
Having the right mindset can be really helpful. Something to understand is that computers are extremely stupid. They operate with pure logic and they don't make any assumptions.

To demonstrate this, let's say you tell the computer to write the numbers 1-10 and make all of the even numbers red.
This is what you might expect:
<p>
<img src="[ignore]/NumsCorrect.png">
</p>
That looks correct, all even numbers are red just as expected.

The computer may also generate the following:
<p>
<img src="[ignore]/NumsRed.png">
</p>
Once again, the list is valid and follows the rules. All even numbers are red.

Some people will get extremely confused by this because their brains will flip the rule and tell them that all red numbers are even, which is not true according to what we told the computer. Computers won't flip the rules or apply any sort of assumptions like a human might.

In fact, the rules don't dictate anything about the odd numbers, so they can be any color we want!
<p>
<img src="[ignore]/NumsRainbow.png">
</p>
Where did the 9 go? The computer must have made an error and forgotten to write the 9... or did it? Maybe the computer made the 9 white, so it blends in with the background. This is completely valid because there is no rule against it.

### What is a Protocol?
TCP, UDP, HTTP(s), FTP, and SMTP are all protocols. Why? What are protocols? Protocols are simply templates that are used to specify what data is where. Let's use an example.
Binary file added Chapter 2 - BinaryBasics/[ignore]/NumsCorrect.png
Binary file added Chapter 2 - BinaryBasics/[ignore]/NumsRainbow.png
Binary file added Chapter 2 - BinaryBasics/[ignore]/NumsRed.png
12 changes: 8 additions & 4 deletions Chapter 3 - Assembly/3.2 MemoryLayout.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# 3.2 Memory Layout
The systems memory is divided into several sections. I prefer to imagine the stack with low addresses at the top and high addresses at the bottom. The reason for this is because it's more like a normal numeric list and it's how you'll most often see it being represented. Also, I'm pretty sure that's how your computer sees it. Be warned though, you will see people represent the memory layout as starting from higher addresses. If you don't know what I just said, don't worry about it.
The system's memory is organized in a specific way. This is done to make sure everything has a place to reside in.

## Assembly Segments
There are different segments/sections in which data or code is stored.
@@ -9,12 +9,13 @@ There are different segments/sections in which data or code is stored.
The start of the program code is declared with "global _start".

## Overview of Memory Sections
* **Stack** - Area in memory that can be used quickly for static data allocation. Data is read and written as "last-in-first-out" (LIFO). The LIFO structure of the stack is often represented with a stack of plates. You can't simply take out the third plate from the top, you have to take off one plate at a time to get to it. You can only access the piece of data that's on the top of the stack, so to access other data you need to move what's on top out of the way. When I said that the stack holds static data I'm referring to data that has a known length such as an integer. We know that an integer will only be 4 bytes so we can throw that on the stack. Unless a maximum length is specified, user input should be stored on the heap because the data has a variable size. When you put data on top of the stack you **push** it onto the stack. When you remove a piece of data off the top of the stack you **pop** it off the stack. There are two registers that are used to keep track of the stack. **When you add data to the stack, the stack "grows" towards lower memory addresses.** The **stack pointer (RSP)** is used to keep track of the top of the stack and the **base pointer (RBP)** is used to keep track of the base/bottom of the stack.
* **Heap** - Similar to the stack but used for dynamic allocation and it's a little slower to access. The heap is typically used for data that is more dynamic (changing or unpredictable). Things such as structures and user input would be stored on the heap. If the size of the data isn't known at compile-time, it's usually stored on the heap. **When you add data to the heap it grows towards higher addresses.**
* **Stack** - Area in memory that can be used quickly for static data allocation. Imagine the stack with low addresses at the top and high addresses at the bottom, just like a normal numerical list. Be warned: you will sometimes see the stack represented the other way around. Data is read and written as "last-in-first-out" (LIFO). The LIFO structure of the stack is often represented with a stack of plates. You can't simply take out the third plate from the top; you have to take off one plate at a time to get to it. You can only access the piece of data that's on the top of the stack, so to access other data you need to move what's on top out of the way. When I say that the stack holds static data, I'm referring to data that has a known length, such as an integer. The size of an integer is defined at compile-time (typically 4 bytes), so we can throw that on the stack. Unless a maximum length is specified, user input should be stored on the heap because the data has a variable size. *However*, the address/location of the input will probably be stored on the stack for future reference. When you put data on top of the stack you **push** it onto the stack. **When data is pushed onto the stack, the stack grows towards lower memory addresses.** When you remove a piece of data off the top of the stack you **pop** it off the stack. There are two registers that are used to keep track of the stack. The **stack pointer (RSP/ESP/SP)** is used to keep track of the top of the stack and the **base pointer (RBP/EBP/BP)** is used to keep track of the base/bottom of the stack.
* **Heap** - Similar to the stack but used for dynamic allocation and it's a little slower to access. The heap is typically used for data that is more dynamic (changing or unpredictable). Things such as structures and user input might be stored on the heap. If the size of the data isn't known at compile-time, it's usually stored on the heap. **When you add data to the heap it grows towards higher addresses.**
* **Program Image** - This is the program loaded into memory. On Windows, this is typically a **Portable Executable (PE)**.
* **DLL** - **Dynamic Link Library (DLL)**. Libraries that can be used by programs.
> Don't worry too much about the TEB and PEB for now.
* **TEB** - The **Thread Environment Block (TEB)** stores information about the currently running thread(s).
* **PEB** - The **Process Environment Block (PEB)** stores information about the process and the loaded modules. One piece of information the PEB stores is "BeingDebugged" which can be used to determine if the current process is being debugged.
* **PEB** - The **Process Environment Block (PEB)** stores information about the process and the loaded modules. One piece of information the PEB contains is "BeingDebugged" which can be used to determine if the current process is being debugged.
MSDN: https://docs.microsoft.com/en-us/windows/desktop/api/winternl/ns-winternl-_peb

Here is a general overview of how memory is laid out in Windows. **This is extremely simplified.**
@@ -73,6 +74,9 @@ If we were going to refer to the data 12345678 we would say that it's stored at

Again, this is quite a simple concept but you need to be sure that you understand it.

## RBP and x64
On x64, it's common to see RBP used in a non-traditional way. Sometimes only RSP is used to point to the stack, and RBP is used for general data (similar to RAX). This sort of behavior is common and is one of many compiler optimizations.

[<- Previous Lesson](3.1%20Registers.md)
[Next Lesson ->](3.3%20Instructions.md)

16 changes: 9 additions & 7 deletions Chapter 3 - Assembly/3.3 Instructions.md
@@ -1,25 +1,27 @@
# 3.3 Instructions
Let's talk about some Assembly instructions. Before we get started there are three different terms you should know: **immediate**, **register**, and **memory**.
* An **immediate value** is something like the number 12 (the kind of number we humans use). An immediate value is not a memory address or register, instead, it's some sort of constant.
The ability to read and comprehend Assembly code is vital to reverse engineering. There are roughly 1,500 instructions; however, the majority are either not commonly used or are just variations (such as MOV and MOVS). Just like in high-level programming, don't hesitate to look up something you don't know.

Before we get started there are three different terms you should know: **immediate**, **register**, and **memory**.
* An **immediate value** (or just immediate, sometimes IM) is something like the number 12. An immediate value is *not* a memory address or register; instead, it's some sort of constant data.
* A **register** is referring to something like RAX, RBX, R12, AL, etc.
* **Memory** or a **memory address** refers to a location in memory (a memory address) such as 0x7FFF842B.

> You may see a semicolon at the end of, or in-between, a few Assembly instructions. This is because the semicolon (;) is used to write a comment in Assembly.
It's important to know the format of instructions which is as follows:
**(Instruction/Opcode/Mnemonic) \<Destination Operand\>, \<Source Operand\>**
> I will be referring to Instructions/Opcodes/Mnemonics as instructions. They all mean the same thing in Assembly.
> I will be referring to Instructions/Opcodes/Mnemonics as instructions; just note that some people call them different things.
Example:
```asm
mov RAX, 5
```
MOV is the instruction, RAX is the destination operand, and 5 is the source operand. Capitalization of instructions or operands does not matter. You will see me use a mixture of all letters capitalized and all letters lowercase.
MOV is the instruction, RAX is the destination operand, and 5 is the source operand. Capitalization of instructions or operands does not matter. You will see me use a mixture of all letters capitalized and all letters lowercase. In the example given, 5 is an immediate value because it's not a valid memory address and it's certainly not a register.

# Common Instructions
## Data:
**MOV** is used to move/store the source operand into the destination. The source doesn't have to be an immediate value like it is in the following example. In the following example, the immediate value of 5 is being moved into RAX.
This is equivalent to RAX = 5.
**This is equivalent to RAX = 5.**
```asm
mov RAX, 5
```
@@ -96,8 +98,8 @@ ret

**NOP** is short for No Operation. This instruction effectively does nothing. It's typically used for padding. Padding is done because some parts of code like to be on specific boundaries such as 16 bit boundaries, or 32 bit boundaries.

## Back To The Example In 0x201
Remember the example from 0x201? Here it is:
## Back To The Example In 3.1
Remember the example from 3.1? Here it is:
```c
if(x == 4){
func1();
2 changes: 1 addition & 1 deletion Credit.md
@@ -4,7 +4,7 @@ These are people that have helped improve this course or support me via [Patreon
# Patreon Supporters
> Last names are not shown for privacy reasons. If you're a Patreon supporter and you wish to have your last name shown, just let me know!
* Tim S
* Tim S.
* Yury

# Bug Squashers:
6 changes: 3 additions & 3 deletions FilesNeeded/FilesNeeded.md
@@ -11,10 +11,10 @@ https://github.com/0xZ0F/Z0FCourse_ReverseEngineering/releases
* RunDLL.exe - Program that uses the DLL.
* README.txt - Contains the information you just read.

* ## 0x900 - Malware - WIP
* ## Chapter WIP - Malware
* Debug Files - pdb and obj files. Helps with debugging.
* Source Code - Source code of the malware and its dropper.
* **THE FOLLOWING ARE TO BE USED AT YOUR OWN RISK. I AM NOT LIABLE FOR ANYTHING DONE:**
* Dropper.Z0F - Program that acts as a dropper for the malware. Rename to Dropper.exe to run it.
* Malware.Z0F - Malware that wil be reverse engineered. Rename to Malware.exe to run it.
* Dropper.Z0F - Program that acts as a dropper for the malware.
* Malware.Z0F - Malware that will be reverse engineered.
* README.txt - Contains the information you just read.
