XMC4500 Study Notes

1. Embedded Programs

**† arm-none-eabi-(gcc | objcopy | objdump)**

(1) What “Embedded Programs” Mean

Embedded programs are “bare metal” programs, i.e. they run without an underlying operating system (OS). You have full control over the hardware of the embedded system. You also have full responsibility: no convenience functions are available.

(2) Retrace Build Steps

Cross Compiler

arm-none-eabi-gcc compiles main.c together with the header files and libraries and generates *.o files. The -I options specify the directories to search for header files to be included. Here these are the device header XMC4500.h and the header files for the GPIO driver.

*.o files are compiled but unlinked versions of source files, not human-readable.

Cross Linker

arm-none-eabi-gcc links all *.o files into the main.elf file. The -T option specifies the linker description file *.ld.

*.elf files (executable and linkable format) are compiled and linked programs, ready to execute on the architecture they are built for.

That means, ELF format is used in two ways: The linker reads it as an input that can be linked with other objects. The loader interprets it as an executable program.

Important ELF sections

objcopy

arm-none-eabi-objcopy creates *.hex files, which contain pure machine code together with information about instruction addresses; they are technically human-readable.

objdump

arm-none-eabi-objdump creates *.lst files, which are a human-readable copy of parts of the *.elf file. What is included is determined by the options of objdump, but usually it is:

– Section headers (where .data, .bss, etc. are located and how large they are)

– Disassembly of the .text section interleaved with the C instructions it was compiled from.

(3) Difference from Computer Programs

Cross-compiler instead of compiler
Device header and device linker file needed
Often additional libraries and drivers necessary
Programming onto uC as another final step

2. XMC4500 Board

(1) Functional Blocks

CPU, memory, clock and reset, timers/counters, communications, analogs, GPIOs

(2) Peripherals

CCU4

provides several counters for PWM generation, counting external events

Measures analog signals

(3) Accessing Peripherals

Memory-mapped: I/O devices are accessed through memory addresses, just like normal memory. The processor accesses them by reading from or writing to those addresses. This is generally more flexible and easier to use, because the processor can use the same instructions and addressing modes as for normal memory. However, the memory bus has to connect to every peripheral, and a longer bus reduces the maximum clock frequency, especially when going off-chip.

Port-mapped: I/O devices are accessed through dedicated I/O ports. Each device is assigned a unique port address, and the processor accesses the device by reading from or writing to that port address via special instructions (IN, OUT) instead of LD, ST. This is generally faster and more efficient than memory-mapped I/O, because it requires fewer bus transactions and can use a simpler addressing scheme. However, specialized instructions and addressing modes are needed for different devices, which adds complexity.
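Memory-mapped access as described above needs nothing more than a pointer in C. The following is a minimal sketch; the helper names `reg_set_bits` and `reg_clear_bits` are made up for illustration and are not part of any XMC library.

```c
#include <stdint.h>

/* Set the bits given in mask in a memory-mapped register.
   "volatile" forces a real load and store on every access. */
void reg_set_bits(volatile uint32_t *reg, uint32_t mask)
{
    *reg |= mask;   /* ordinary load, OR, store: no special I/O instruction */
}

/* Clear the bits given in mask in a memory-mapped register. */
void reg_clear_bits(volatile uint32_t *reg, uint32_t mask)
{
    *reg &= ~mask;
}
```

On real hardware the pointer would be a fixed peripheral address taken from the device header; on a PC the same functions can be tried out on an ordinary variable.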

3. Cortex M4

(1) Functional Blocks

The Cortex-M4 has a three-stage pipeline (Fetch, Decode, Execute) with the following functional blocks.

(2) Registers

1. Registers for Function Arguments

2. Caller/Callee-saved Registers

Registers are special memory locations in the processor that are used to store data temporarily while a program is running.

Caller-saved registers (R0 – R3, R12) are registers that a called function may overwrite without restoring them. This means that the calling function (the “caller”) is responsible for saving the values of these registers before making the function call, and restoring them after the function returns, if it still needs them.

Callee-saved registers (R4 – R11), on the other hand, are registers that are expected to be preserved by the called function, and are guaranteed to retain their values after the function returns. This means that the called function (the “callee”) is responsible for saving the values of these registers before modifying them, and restoring them before returning control to the calling function. To do this, the callee must either leave them unchanged or push them on the stack in the beginning and pop them back before return.

3. Special Registers

R13 (Stack Pointer): The SP determines the border between allocated and unallocated memory on the stack. If a function requires stack space, it allocates it by decreasing the SP.

R14 (Link Register): The LR can be seen as a kind of hidden argument register that tells a callee the return address.

R15 (Program Counter)

4. PSR (Program status registers)

ApplicationPSR: N, Z, C, V (flags)

ExecutionPSR: IT (if-then instruction status bits), T (thumb state, always 1 for Cortex-M)

InterruptPSR: EN (exception number)

4. Assembler

(1) Instructions

Data Processing

ADD r3, r4, r5;

Add the contents of r4 and r5 and store the result in r3.

ADC r0, r1, r2;

Add the contents of r1 and r2 and store the result in r0, taking into account the carry flag.

SUB r3, r4, r5;

Subtract the contents of r5 from r4 and store the result in r3.

NEG r12, r13;

Negate the contents of r13 and store the result in r12.

Data Move

MOV r0, #0;

Move the value 0 into register r0. The # symbol indicates that the value is an immediate value, rather than a register or memory location.

LDR r1, [r2];

Load the value at the memory location pointed to by r2 into r1. The [] symbol indicates that the operand is a memory location, rather than a register or immediate value.

STR r6, [r7, #4];

Store the contents of r6 at the memory location pointed to by r7 + 4.

Control Flow

B foo;

Jump to the label foo. This instruction adds a delta to the current PC.

BX r0;

Jump to the address stored in r0 and, depending on the LSB of that address, change the execution mode. This branch writes a new address value to the PC.

BL foo; (function call)

Call the function foo and store the return address in the link register (LR). Before the branch is executed, the address of the instruction following the BL is copied into the LR.

pop {r4,r5,r7,pc}; (function return)

(2) RISC vs CISC

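RISC: Apart from load/store, no instructions access memory. Instructions have a fixed length; programs need many instructions, but the CPU is simple and the clock frequency is high. By reducing the number of addressing modes, a RISC computer achieves less complexity and higher clock frequencies.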

Distinct load and store instructions, lacking memory addressing modes for data processing instructions (e.g. ADD), and fixed length instructions all indicate RISC.

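CISC: Load/store is integrated into various instructions. Instructions have a variable length; programs need fewer instructions, but the CPU is more complex.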

(3) Thumb Mode

Thumb or ARM Mode

The LSB of a branch target address is used for detection: an even address is treated as ARM code, an odd address as Thumb code.

Operands

The result counts as an operand of the opcode; thus Cortex-M opcodes have three operands: result, operand1, operand2.

Suffix “S”

Suffix “S” tells the CPU to update the conditional execution flags depending on the result of this operation, i.e. ADDS is ADD with S suffix, only ADDS updates APSR flags.

32-bit Literal

As instructions are only 32-bit long and a few bits are needed to encode the opcode, the 32-bit literal cannot be placed as an immediate in the instruction. Use MOV for the lower 16 bits and MOVT for the upper 16 bits.

Alternatively, we can use the so-called literal pool with LDR r0,=0x12345678. The literal is placed into the text section right after the current function and is loaded from there using the PC with an offset automatically calculated by the assembler.

Function call

Function calls/ Jumps are done using B or BX. Note that the LR needs to be updated to the address of the next instruction after the function call. BL or BLX do that automatically.

5. GNU Debugger

(1) Comparisons of Debug Methods

(2) GDB Overview

GDB can be directly attached to any PC program.

(3) Cheatsheet

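Stop execution when the function foo() is reached – break foo

Execute a single line of source code / a single assembly instruction – step (s) / stepi (si)

Stop execution when the variable Bytes changes – watch Bytes

Resume execution until the next stop – continue (c)

Delete breakpoint number 2 – delete 2

Change the layout to show source code and assembly at the same time – layout split

Move the focus to the command window so that the arrow keys browse the command history instead of scrolling the source – focus cmd

Print the value of the variable counter / of register r3 – print counter / print $r3

Set the variable counter to 7 – set var counter = 7

Print the 32-bit value at address 0x08000000 in hexadecimal – x /1wx 0x08000000

Display the value of Bytes every time execution stops – display Bytes

Show the local variables of the current function – info locals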

6. Memory Organization Vulnerabilities

(1) Sections in a regular OS-based system

BSS: uninitialized global data uint32_t bla;
data: initialized global data uint32_t bla2 = 0xFEFE;
heap: dynamically allocated data long *foo = calloc(a, sizeof(long));
stack: local variables uint8_t bla3 = 5;
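The four cases listed above can be put side by side in one small translation unit. This is only an illustrative sketch: the names `boot_count`, `magic` and `sum_on_heap` are invented here, and an optimizing compiler may keep the local variable in a register instead of on the stack.

```c
#include <stdint.h>
#include <stdlib.h>

uint32_t boot_count;          /* BSS: uninitialized global, starts at 0 */
uint32_t magic = 0xFEFE;      /* data: initialized global */

/* Fills a heap array of n elements with a constant and returns the sum. */
uint32_t sum_on_heap(size_t n)
{
    uint8_t step = 5;                              /* stack (or a register) */
    uint32_t *values = calloc(n, sizeof *values);  /* heap */
    uint32_t sum = 0;

    if (values == NULL) {
        return 0;
    }
    for (size_t i = 0; i < n; ++i) {
        values[i] = step;
        sum += values[i];
    }
    free(values);
    return sum;
}
```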

(2) Section Locations

Virtual Memory Address & Load Memory Address

The virtual memory address (VMA) is the address the processor uses to access a location in memory while the program runs. On a system with virtual memory, each process is assigned a virtual address space, i.e. a range of addresses it uses to access its code and data, regardless of whether they currently reside in RAM or on disk; the operating system is responsible for mapping these virtual addresses to physical addresses. In a linker description file, the VMA of a section is simply the address the section has at run time.

The load memory address (LMA) is the address at which a particular piece of code or data is loaded. This address is typically specified in the program or data file that the operating system or other software uses to place the program or data at the appropriate location in memory.

1. SRAM

SRAM is volatile, which means that it requires a constant power supply to retain its stored data. The data stored in SRAM will be lost if the power supply is interrupted.

For the data section, the VMA is in SRAM, because the program needs to be able to modify the data.

maximum size of stack

The main stack currently occupies 0x10000000 through 0x10000800, so 2 KiB, which is the maximum size during runtime. It can be made larger in the linker description file, then the maximum is the size of PSRAM, 64 KiB.

maximum size of heap

However, the actual size limit of the heap is defined in the linker description file. Of course, the limit defined there must be small enough such that the heap and all the other sections, e.g. data and bss, all together fit into DSRAM1. Note that this size limit cannot be used entirely for heap storage, because each chunk consumes an additional 4B for its header.

2. FLASH

Flash memory is non-volatile and based on electrically-erasable programmable read-only memory (EEPROM) technology. Non-volatile memory retains its stored data even when the power supply is interrupted.

For the data section, the LMA is in FLASH, because the initialization values need to be in some non-volatile memory. Startup code in the boot routine copies the initialization values from FLASH to SRAM and clears the BSS.
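What the startup code does with the data and BSS sections can be sketched in a few lines of C. This is a simplified illustration, not the actual XMC4500 startup file: in a real project the addresses and sizes come from symbols defined in the linker description file, whereas here they are plain parameters with hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy the initial values of the data section from its LMA (FLASH)
   to its VMA (SRAM). */
void copy_data(const uint32_t *lma, uint32_t *vma, size_t words)
{
    for (size_t i = 0; i < words; ++i) {
        vma[i] = lma[i];
    }
}

/* Zero the BSS section, which has no initial values stored in FLASH. */
void clear_bss(uint32_t *start, size_t words)
{
    for (size_t i = 0; i < words; ++i) {
        start[i] = 0u;
    }
}
```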

(3) Address space

Address space for stack, data and BSS can be read out from *.lst file. Location of the heap cannot be read from it but has to be tried using a debugger and some calloc calls.

The stack cannot crash into the heap, because on many platforms the heap and the stack are allocated in different pages and will never meet. It may, however, run out of memory.

(4) Stack Frame

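If arguments are passed on the stack, e.g. if a function has more than four arguments, they are part of the caller’s stack frame, not the callee’s stack frame. Therefore, in the figure above, the function arguments are briefly marked as part of the “previous frame”.

In addition, the compiler may place a copy of an argument in the area reserved for local variables. For example, suppose the first argument, i.e. the one passed in R0, is needed at the end of the function, but another function with other arguments has to be called before that. Then registers R0-R3 must be freed because the other function may clobber them, so the compiler has to save our first argument, (probably) on the stack.

If an overly long string is placed into the local variable area, the return address above the local variable area may be overwritten. This can change the program flow, because the value of the return address is loaded into the program counter when the current function returns.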

(5) Buffer Overflow Attack

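See Section 2.3.3, page 2-22 of the reference manual for the default access permissions. The code, SRAM, and external RAM regions are all executable by default. The stack is in PSRAM, which lies in the code memory region of the Cortex-M4, ranging from 0x00000000 to 0x1FFFFFFF. So the stack is executable by default.

Use info frame to determine the locations of the buffer and the return address, and how long the exploit has to be to overwrite the return address:

As shown in the figure, the buffer is at &buf = 0x100007c0. GDB calls the return address lr; it is located at 0x100007e4. There are 36 bytes between the two, so our exploit needs to be 40 bytes long to overwrite the return address.

Design an exploit:

Since the given code is only 20 B long, we need to add 16 B of padding, followed by the new return address that points to the exploit code. The exploit code is located at the beginning of buf, i.e. at 0x100007c0. In Thumb mode, the new return address is 0x100007c0+1=0x100007c1.

One possible exploit is (each byte is represented by two hex digits):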

FD 46 48 F2 01 12 C4 F6 02 02 80 21 D1 73 C9 09 D1 70 FE E7 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF C1 07 00 10

Alternatively, we can put the 16 B of padding first and change the return address to 0x100007d1. The exploit then is:
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff fd 46 48 f2 01 12 c4 f6 02 02 80 21 d1 73 c9 09 d1 70 fe e7 d1 07 00 10

– exploit instruction

– padding

– new return address

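Little Endian:

Before this file is sent to the board, it needs to be converted to its binary representation.

The bytes stored at 0x100007c0 through 0x100007c3 are FD 46 48 F2; interpreted as a 32-bit value, this is 0xF24846FD.

Taking the 32-bit word a at 0x100007c0 as an example: on a little-endian machine, the MSByte is obtained with *((uint8_t *)&a+3), whereas on a big-endian machine it is obtained with *((uint8_t *)&a).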
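The byte order can be made explicit with a small helper that assembles a 32-bit value from four bytes stored in little-endian order. This is a sketch for illustration; `load_le32` and `is_little_endian` are hypothetical names, not functions from the course material.

```c
#include <stdint.h>
#include <string.h>

/* Assemble four bytes stored in little-endian order into a 32-bit value.
   bytes[0] is the byte at the lowest address, i.e. the LSByte. */
uint32_t load_le32(const uint8_t bytes[4])
{
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}

/* Return 1 if the machine running this code stores the LSByte first. */
int is_little_endian(void)
{
    uint32_t probe = 1u;
    uint8_t first_byte;

    memcpy(&first_byte, &probe, 1);
    return first_byte == 1u;
}
```

Called with the bytes FD 46 48 F2 from the exploit, `load_le32` yields 0xF24846FD regardless of the byte order of the machine it runs on, because it only uses shifts and never reinterprets memory.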

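Drawbacks of strcpy():

If we look at the exploits above, they all contain a 00 byte in the new return address. This causes strcpy() to terminate at this point without writing the last byte, which leaves the new return address pointing to 0x080007c1, and that is not where our exploit is located. So in this particular example, it is not possible to perform a buffer overflow attack on the stack using strcpy().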

Find Buffer Overflow Vulnerability

The string givenPW is allocated with length 21, but then in line 8, up to 0x21=33 characters are allowed to be written.

Pro tip: Use a macro or a const variable with a name to hold the size and always use this variable instead of the plain number a.k.a. magic number. That not only avoids such vulnerabilities but also makes your code much more comprehensible.
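A minimal sketch of this tip, assuming a password buffer of 21 bytes as in the example. The function `store_password` and the macro `PW_BUFFER_SIZE` are hypothetical names; the point is only that the declaration and the length limit use the same named constant.

```c
#include <stdio.h>

#define PW_BUFFER_SIZE 21   /* 20 characters plus the terminating '\0' */

/* Copy src into dst, never writing more than PW_BUFFER_SIZE bytes.
   Returns 0 if src fit completely, -1 if it had to be truncated. */
int store_password(char dst[PW_BUFFER_SIZE], const char *src)
{
    int needed = snprintf(dst, PW_BUFFER_SIZE, "%s", src);

    return (needed >= 0 && needed < PW_BUFFER_SIZE) ? 0 : -1;
}
```

If the buffer ever has to grow, only the macro changes, and a mismatch such as 21 versus 0x21 cannot occur.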

7. Exceptions and Interrupts

(1) Use Cases

Reaction to events outside of the CPU, e.g. ADC conversion finished

Reaction to outside events is also possible via polling.

Multi-Tasking with termination of hung-up tasks

Multi-Tasking may work without interrupts if all tasks periodically call a context-switching function, but this is highly impractical. In real scenarios, and also if it comes to terminating hung-up tasks, multi-tasking cannot be realized without interrupts. Usually, the SysTickTimer is used to switch context and hand over the CPU to the next task in the queue.

Power Saving

Power saving requires IRQs (interrupt requests) to wake up the CPU after sleep. So this cannot work without interrupts (reset is often considered just a special case of interrupt or exception).

(2) Interrupts vs. Polling

1. According to the problem, a context switch happens once per second. The CPU load for the context switch is . It is obvious that 30 µs + 5 µs < 50 µs.

No flags have to be polled in case of interrupts, but we have to save the current CPU context on the stack and restore it after the ISR (interrupt service routine) finishes. Such a context switch only happens if the IRQ is pending, i.e. after the outside event occurred.

2. The flag needs to be polled at least every 18 µs, i.e. with a frequency of at least 1 / 18 µs ≈ 55.6 kHz. The CPU load for the polling is 2 µs / 18 µs ≈ 11.1 %.

We must have finished the subroutine within 50 µs after the event. Considering that processing the subroutine takes 30 µs, the subroutine has to start not more than 20 µs after the event. As the poll itself takes 2 µs, the start of a poll must thus be not more than 18 µs after the last poll started.

Since the flag is not guaranteed to be set within 2 µs after the event happens, we have to assume that a poll is not guaranteed to be successful when the event happens.
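The polling arithmetic above can be written down as two small functions. This is just a worked example using the numbers from the text (50 µs deadline, 30 µs subroutine, 2 µs per poll); the function names are made up.

```c
/* Longest allowed time between the starts of two polls, in microseconds.
   The subroutine and one complete poll must still fit before the deadline. */
unsigned max_poll_period_us(unsigned deadline_us, unsigned subroutine_us,
                            unsigned poll_us)
{
    return deadline_us - subroutine_us - poll_us;
}

/* CPU load caused by the polling alone, in percent. */
double polling_load_percent(unsigned poll_us, unsigned period_us)
{
    return 100.0 * (double)poll_us / (double)period_us;
}
```

With the values from the text, `max_poll_period_us(50, 30, 2)` gives 18, and a 2 µs poll every 18 µs keeps the CPU busy for roughly 11 % of the time.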

3. The CPU load for the subroutine is .

Therefore, the interrupt-based implementation has an overall CPU load of

And the polling-based implementation has an overall CPU load of

1. Due to the interrupt latency (5 µs) and context switching (4 µs), interrupts cannot be used.

2. Since we poll the GPIO line, what decides is whether the system can poll fast enough (frequency). Polling can achieve the requirements.

Although in this case, the polling loop requires 100% CPU load, it does so only for a very short period of time after the SW initiates a mode change. Thus the average CPU load is not much affected by the polling loop.

(3) ISR (interrupt service routine)

Transparency

In general, an ISR should be transparent to other code, which means that it should not interfere with the normal operation of the system. I.e. except for the variables intended to be changed by the ISR, everything – including all registers and special registers such as APSR – must be restored to its original value.

Context saving

The registers that are saved automatically upon an IRQ for the XMC4500 include PC, PSR, R0, R1, R2, R3, R12, LR.

Number of arguments

None, because there is no caller that could set the arguments to some meaningful value. But this is not true for exceptions in general.

Access of IRQ

For performance optimization, the compiler might keep a local copy of a variable in a register for repeated access. If ISR updates the original variable in SRAM, the software will continue to use the old value. If it is a wait loop that postpones code execution until, e.g. a certain number of bytes are received by the UART, the system will hang forever.

The keyword volatile can be given to a variable to avoid this issue. The use of the “volatile” keyword tells the compiler that it should not optimize access to the variable, as the value of the variable may change unexpectedly.
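A minimal sketch of the situation described above, assuming a hypothetical UART receive ISR called `uart_rx_isr`. Because `bytes_received` is declared volatile, the compiler has to re-read it from memory on every access.

```c
#include <stdint.h>

/* Shared between the ISR and the main code, hence volatile. */
volatile uint32_t bytes_received = 0;

/* Hypothetical UART receive ISR: runs asynchronously to the main code. */
void uart_rx_isr(void)
{
    bytes_received = bytes_received + 1u;
}

/* Returns 1 once at least `wanted` bytes have arrived, 0 otherwise. */
int have_bytes(uint32_t wanted)
{
    return bytes_received >= wanted;
}
```

A wait loop would then read `while (!have_bytes(4)) { }`. Without volatile, the compiler would be allowed to read the counter once and spin on the stale value.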

8. Memory Protection Unit

MPU defines regions in memory and specifies attributes for them:
r-- r--: read-only in privileged mode
rw- rw-: read&write, never execute
rw- r--: read always, but write only in privileged mode
r-x r-x: read&execute, never write

(1) Access to Memory Sections

Peripherals are located at addresses `0x40000000` up to `0x5FFFFFFF`.

The text (code) section is a region of memory that is used to store the executable instructions of a program. For these instructions to be executed, the text section must be marked as executable.

(2) MPU Configuration

XMC4500

Up to 8 regions, each between 32 B and 4 GB in size. The regions are distinguished by priority, i.e. there is only one region per priority level.

Background region for privileged level with the lowest priority.

Define MPUconfig_t

enum MPUeasyPermissions {
    MPUeasy_None_None = 0,
    MPUeasy_RW_None = 1,
    MPUeasy_RW_R = 2,
    MPUeasy_RW_RW = 3,
    MPUeasy_R_None = 5,
    MPUeasy_R_R = 6
};
#define MPUeasyXN (0x1<<4)
#define MPUeasyENABLEREGION (0x1<<7)
typedef struct {
    void * baseAddress;
    int permissions;
    uint8_t size;
    uint8_t priority;
} MPUconfig_t;

uint8_t size is given as a power of 2 (the exponent), e.g. 10 = 1 KiB, 20 = 1 MiB.

“#define” directive is used to define a macro. A macro is a fragment of code that is replaced with a different fragment of code when the program is compiled. Macros can be used to simplify complex code, to improve readability, or to provide a convenient way to reuse code.

Define regions

The proper MPU configuration for the sections mentioned in the previous question looks like that:

MPUconfig_t FLASH = {.baseAddress =( void *) 0x08000000, .size =27, .priority =0, .permissions = MPUeasyENABLEREGION | MPUeasy_R_R };
// 10000000 | 110 = 10000110
MPUconfig_t PSRAM = {.baseAddress =( void *) 0x10000000, .size =16, .priority =1, .permissions = MPUeasyENABLEREGION | MPUeasy_RW_RW | MPUeasyXN };
// 10000000 | 11 | 10000 = 10010011
MPUconfig_t DSRAM1 = {.baseAddress =( void *) 0x20000000, .size =16, .priority =2, .permissions = MPUeasyENABLEREGION | MPUeasy_RW_RW | MPUeasyXN };
// 10000000 | 11 | 10000 = 10010011

MPUconfig_t Peripherals = {.baseAddress =( void *) 0x40000000, .size =29, .priority =3, .permissions = MPUeasyENABLEREGION | MPUeasy_RW_RW | MPUeasyXN };
// 10000000 | 11 | 10000 = 10010011

The size for the FLASH region is 27 and not 20 as one would expect for 1 MiB, because the cached access to the FLASH runs via addresses 0x0C000000 up to 0x0C0FFFFF and we want to capture both cached and uncached access to the FLASH.

The “|” operator is a bitwise OR operator. the “.” operator is used to access the members of a structure. It is used to both create and initialize a variable of a structure type.

Although the regions for PSRAM, DSRAM1, and peripherals share the same access permissions, we have to define separate regions for them for two reasons:

First, a single region ranging from 0x10000000 to 0x5FFFFFFF would have size 2^30.3219B which is not an integer power of 2.

A common region for PSRAM and DSRAM1 with size 2²⁹B would have a feasible size, but is not possible for the second reason, namely that region base addresses have to be aligned to the size of the region. A region of size 2²⁹B would need to start at an address that has its 29 lowermost bits equal to zero (>=0x20000000), which is not the case for 0x10000000.
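The alignment rule can be checked mechanically. The following sketch tests whether a base address is aligned to a region of 2^size bytes; `mpu_region_is_aligned` is a hypothetical helper, not part of the course's MPU code.

```c
#include <stdint.h>

/* Returns 1 if base is a valid start address for a region of
   2^size_log2 bytes, i.e. if its size_log2 lowermost bits are zero. */
int mpu_region_is_aligned(uint32_t base, uint8_t size_log2)
{
    uint32_t mask;

    if (size_log2 < 5 || size_log2 > 32) {   /* 32 B up to 4 GB */
        return 0;
    }
    if (size_log2 == 32) {                   /* whole address space */
        return base == 0u;
    }
    mask = ((uint32_t)1u << size_log2) - 1u;
    return (base & mask) == 0u;
}
```

For the regions above, 0x08000000 with size 27 and 0x40000000 with size 29 pass the check, while 0x10000000 with size 29 fails it.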

Calling configMPU()

After defining the appropriate regions as MPUconfig_t, we have to program them into the MPU by calling configMPU() on each one. Then we can enable the MPU and drop our privileges. If the program continues to run, we have set up the MPU correctly.

The functions to check the current privilege level and drop privileges are provided by another set of small helper functions in privilege.c.

Note that the Private Peripheral Bus (PPB) is always accessible in privileged mode even if there is no region defined for it and with a disabled background region.

Example

Now a credential store is added to the system in the uppermost 1KiB of DSRAM1, which should be only readable by the task. We change the configuration like:

MPUconfig_t Secret = {.baseAddress =( void *) SECRETSTORE, .size =10, .priority =4, .permissions = MPUeasyENABLEREGION | MPUeasy_RW_R | MPUeasyXN };
// 10000000 | 10 | 10000 = 10010010
MPUconfig_t DSRAM1 = {.baseAddress =( void *) 0x20000000, .size =16, .priority =2, .permissions = MPUeasyENABLEREGION | MPUeasy_RW_RW | MPUeasyXN };
// 10000000 | 11 | 10000 = 10010011

You do not have to exclude the uppermost 1 KiB for the secret store from the DSRAM1 region, because the higher priority of the Secret region will override the permissions of this part of the DSRAM1 region. The priority for Secret can be any priority (0-7) that is yet unused (4-7) and larger than the priority of the DSRAM1 region (>2).

9. Manual Canary

(1) Secure below function using canaries

Using struct

We use a struct to prohibit the compiler from reordering the local variables during alignment:

This is a hypothetical example. In practice, you would not add the canary yourself in a real program but use the -fstack-protector option of your compiler; the compiler then decides whether it spends the extra effort to protect exactly the array boundaries or only the control flow information, i.e. the return address.
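A struct-based canary could look roughly like the sketch below. The function, the buffer size and the constant `CANARY_VALUE` are all invented for illustration; a real canary value would have to be unpredictable. The copy deliberately has no bounds check on `buf` and is only clamped to the size of the struct so that the demonstration itself stays well-defined.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 16u
#define CANARY_VALUE 0xDEADC0DEu   /* placeholder, not a secure choice */

/* The struct fixes the order: the canary sits directly behind buf. */
struct guarded_frame {
    char buf[BUF_SIZE];
    uint32_t canary;
};

/* Copies n bytes from src to the start of buf without checking BUF_SIZE.
   Returns 0 if the canary survived, -1 if it was overwritten. */
int copy_and_check(const char *src, size_t n)
{
    struct guarded_frame frame;

    frame.canary = CANARY_VALUE;
    if (n > sizeof frame) {
        n = sizeof frame;          /* keep the demo inside the struct */
    }
    memcpy(&frame, src, n);        /* buf is the first member (offset 0) */
    return frame.canary == CANARY_VALUE ? 0 : -1;
}
```

A real function would react to a return value of -1 by aborting instead of returning normally.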

Using Buffer

Another smart solution is to increase the size of the buffer and make the canary part of the buffer itself. Then no struct is required to prohibit reordering, but it gets somewhat complicated to access the canary.

(2) Properties of Canaries

If the value of the canary can be guessed or tried out by an attacker, she can overwrite the canary with its original value, such that it is not changed. This would render an attack unnoticeable and must thus be made infeasible. The value should therefore fulfill the following properties:

Unpredictable and not readable for the attacker in any way
Large enough to avoid trying out all possible values (brute-force)
Ideally, change upon each program invocation (a change upon each function call would make the program terribly slow)

10. Other Software Attacks

(1) Heap Based Buffer Overflow

Example

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char *buf;

    // Allocate memory on the heap
    buf = malloc(10);
    if (buf == NULL) {
        perror("malloc failed");
        return 1;
    }

    // Read input from the user
    printf("Enter a string: ");
    fgets(buf, 20, stdin);

    // Print the input back to the user
    printf("You entered: %s\n", buf);

    // Free the allocated memory
    free(buf);
    return 0;
}

The “fgets()” function is called with a buffer size of 20, which is larger than the size of the allocated buffer (10). This means that the program writes more data to the buffer than it is intended to hold, which can cause a buffer overflow. In this example, the size of the buffer should be passed as the second argument to “fgets()” rather than hardcoded as a constant.

This attack can overwrite the backward and forward pointer.

Use-after-free Bug

Consequence: The memory locations might already be allocated for a different purpose. Reading from them may cause the function to perform unexpected and possibly exploitable actions. Writing to it clobbers data of the other function that the memory locations are now allocated to and may cause this code to malfunction.

Double-free Bug

Countermeasure: The easiest way to do this is to always set the pointer to NULL when it is freed, just like malloc and calloc return a NULL pointer when the allocation failed. According to the C standard, freeing a NULL pointer does no harm.
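This countermeasure is often wrapped in a small macro so that it cannot be forgotten. `FREE_AND_NULL` is a hypothetical name used here for illustration.

```c
#include <stdlib.h>

/* Free the memory p points to and reset p to NULL in one step.
   A second use on the same pointer then calls free(NULL), which is a no-op. */
#define FREE_AND_NULL(p) \
    do {                 \
        free(p);         \
        (p) = NULL;      \
    } while (0)
```

Note that this only protects the pointer variable passed to the macro; other copies of the same pointer still dangle after the free.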

(2) Format String Attacks

Read password

In line 7, an attacker-controlled string is used as a format string to printf. An attacker may thus add conversions to this string to read correctPW from memory.

We know that r0 to r3 contains the first four arguments of a function, and the return value is placed in r0. Tracing back the code, the last time r1 is used, is to pass correctPW to strcmp(), so correctPW is still in r1 when printf() is called.

The “printf()” function in C takes a format string as its first argument (r0) and a variable number of additional arguments (r1, r2, r3, stack ...) that are used to fill in placeholders in the format string. It is thus sufficient for an attacker to provide %s as the givenPW, because printf() will interpret the conversion and print the string pointed to by r1 (second argument), which is the correctPW.

Read location of stack

In line 8 a user-controlled string is used as format string. The “sprintf()” function takes a string buffer as its first argument and a format string as its second argument, and a variable number of additional arguments that are used to fill in placeholders in the format string.

A sufficient number of %x or %p will print out the value of r2, r3, and contents in stack including the previous stack frame pointer (under the return address) into the debugString, which will eventually be displayed on the screen.
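The usual fix for both cases is to never pass user-controlled text as the format string, but as an argument to a constant "%s" format. A minimal sketch, with a hypothetical function name:

```c
#include <stdio.h>

/* Writes user-controlled text into out. The text is passed as data for
   the constant format "%s", so conversions inside it are not interpreted. */
int format_debug_string(char *out, size_t out_size, const char *user_text)
{
    return snprintf(out, out_size, "%s", user_text);
}
```

An input such as `%x%x%s` is then copied literally into the output instead of leaking registers or stack contents.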

(3) Integer Underflows

for(uint8_t i = 42; i >= 0; --i);

The programmer intended to loop 43 times by decrementing the variable uint8_t i from 42 to 0. The loop should stop as soon as i becomes negative. The actual behavior of the implementation is an endless loop due to the declaration of the variable i as uint8_t, i.e. as an unsigned 8 bit variable of range 0 to 255. If i=0 is decremented by one, the result will be 255. This issue is called integer underflow.
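One way to keep the unsigned type and still terminate is to test the counter before decrementing it. The sketch below wraps the loop in a hypothetical function that counts its iterations.

```c
#include <stdint.h>

/* Counts how often the loop body runs when i goes from 42 down to 0. */
unsigned count_down_iterations(void)
{
    unsigned iterations = 0;

    /* The test uses the old value of i, the body sees i = 42, 41, ..., 0.
       The loop ends when the old value is 0, before the body could run
       with the wrapped-around value 255. */
    for (uint8_t i = 43; i-- > 0; ) {
        ++iterations;
    }
    return iterations;
}
```

Declaring i as a signed type such as int8_t would be the other obvious fix.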

(4) SQL Injection

Code Injection requires that data is treated as code so that it can contain variables.

userpass = sqlInt.execute("SELECT password FROM users WHERE username = '" + userName + "';");

The variable userName is taken directly from the input of the login form. The SQL statement retrieves the password corresponding to the entered username and checks it against the array userpass.

The statements after the WHERE keyword filter what is retrieved from the database. So we need to disable the filter or make it always true:

userpass = "SELECT password FROM users WHERE username = '' OR '1'='1';"

The ’1’=’1’ statement is always true.

(5) Cross Site Scripting(XSS)

XSS requires that server stores user data and displays it on its webpages to others e.g. comment fields in online shop, forum entries, etc. and server does not check data for statements interpreted by a browser.

Popular not-so-harmful example: Alert box using javascript:

Javascript can also do other things, like stealing a session cookie and sending it to the attacker, which then can impersonate the victim.

When is there a risk of code injection?

Whenever code and data are only weakly separated, e.g. in von Neumann architectures or scripting languages.

11. Security and Cryptography

(1) Security Objectives (CIA)

Objectives	Measure
Confidentiality	Access Control / Encryption
Integrity	Write Protection / Crypto Signature
Availability	Redundancy
Accountability	Logging
Authenticity	Password / Crypto Signature
Privacy	Data Minimization / Pseudonyms

(2) Crypto Algorithm Overview

Symmetric Cryptography

Block Ciphers	Description	Block Size / bit	Key Size / bit
DES	proven weak	64	56
IDEA	international data encryption algorithm	64	128
AES	Advanced Encryption Standard	128	128 / 192 / 256
SPECK	linear cipher, light weight	32 – 128	64 – 256

Stream Cipher	Key Size / bit
RC4(weak)	8 – 2048
Salsa20	256

Asymmetric Cryptography

Cryptography	Key Size
RSA	1024(weak), 4096
DSA(digital signature)	1024
ECDSA	160

Cryptographic Hash Functions

Hash Functions	Output Size / bit
MD5(weak)	128
SHA-1(weak)	160
SHA-2	224 – 512
SHA-3	224 – 512

(3) Confusion & Diffusion

Confusion of mapping

The relation between plaintext and ciphertext shall be highly complex, known pairs of plaintext and ciphertext shall not allow recovering the key.

Diffusion of entropy

Every bit of input to the cryptographic algorithm shall affect all bits of the output.

(4) Attacks

Block Cipher

Instead of trying brute-force which is infeasible, one can build up a look-up table and perform a search-and-replace attack.

Ideal Stream Cipher

A brute-force attack is not possible, because all keys are equally probable and a given ciphertext could equally probably represent any plaintext.

One-time Pad

The problem with OTP is that it requires a secure method for generating and distributing the keys, as well as securely storing them. The key must be kept secret and be as long as the plaintext. This is a difficult task to accomplish in practice, especially for large amounts of data or for long-term storage.

Another issue with OTP is that it is not very efficient: as much key material as plaintext is required, which can lead to large key sizes and slow encryption and decryption times.

Asymmetric Cryptography

An IoT node communicates with multiple webservers via TLS, a hybrid cryptography protocol. If the IoT node wants to communicate, it first exchanges public keys with the respective server, then establishes a signed and encrypted channel with them.

If an attacker redirects the entire traffic to his servers, the IoT node won’t notice and will still accept the attacker’s public key. To avoid such attacks, the IoT node could keep a list of public keys of the trusted servers.

To avoid such attacks, real-world asymmetric cryptography never uses raw public keys, but certificates, which are basically just signed public keys for a certain URL, email address, etc. Whether to trust the public key can then be decided depending on who signed it.
The webbrowser has a built-in list of so called certificate authorities (CAs), whose business model is to sign public keys for money. If you now browse a website that uses TLS, your browser will check if the certificate returned by the server contains a signature of one of these trusted CAs (often via a couple of intermediate certifications) and only if so accept the public key.

In this problem’s scenario, the public key returned by the attacker’s server will not be accompanied by a signature of one of these CAs and thus the browser will display a warning that the public key could not be authenticated and it is likely that you are currently under attack.

12. Side Channel Attacks

(1) Types

A side channel is a physical quantity that is related to the operation of a cryptosystem but not intended to carry information:

Time
Power
Electromagnetic emanations
Acoustic emanations
Temperature
Light

(2) Timing of Password Check

strcmp terminates upon the first character that differs. So by observing how long it takes to check the password, one can determine how many characters – at the beginning of the string – are correct.

The cryptographic library sodium provides a more secure function for comparing data: sodium_memcmp
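The idea behind such a comparison can be sketched by hand: look at every byte, no matter where the first difference is. The function below is a simplified illustration and not the implementation used by sodium; in particular, it returns 1 on a mismatch, and a production version takes extra care that the compiler does not optimize the loop into an early exit.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 0 if both buffers are equal and 1 otherwise.
   All len bytes are examined regardless of where they differ. */
int constant_time_compare(const uint8_t *a, const uint8_t *b, size_t len)
{
    uint8_t diff = 0;

    for (size_t i = 0; i < len; ++i) {
        diff |= (uint8_t)(a[i] ^ b[i]);   /* collect all differing bits */
    }
    return diff != 0;
}
```

The running time depends only on len, so it no longer reveals how many leading characters of a password guess are correct.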

(3) Power Trace of Square-Multiply Algorithm

13. Embedded Communication

(1) Embedded Communication Standards

RS-232

I2C(Inter-Integrated Circuit)

SPI(Serial Peripheral Interface)

It is a full-duplex communication protocol, which means that data can be transmitted and received simultaneously on separate lines. It is used for communication between integrated circuits.

Ethernet

A widely used local area network (LAN) protocol that supports data rates from 10 Mbps to 100 Gbps. It is a standard for connecting computers and other devices in a LAN and is also used in wide area network (WAN) connections. Ethernet is based on the use of a shared medium, typically a wired cable, to transmit data between devices.

UART(Universal Asynchronous Receiver Transmitter)

It allows devices to transmit and receive data serially (i.e., one bit at a time) over a single communication line or channel. UARTs are commonly used in embedded systems, such as microcontrollers, to communicate with other devices, such as sensors, memory, and other peripherals.

(2) USB (Universal Serial Bus)

It is a standard for connecting devices to a computer or other host. It is a serial bus that provides a standard interface for connecting a wide variety of peripherals, such as keyboard, mouse, cameras, printers, and external hard drives.

USB connection sequence

Attach: The host detects the connection and sends a reset signal to the device.
Read Device Descriptor: The host queries the device for its identity and configuration information. The device responds with its device descriptor, which contains information such as its vendor ID, product ID, and supported USB version.
Assign Address: The host assigns a unique address to the device and the device uses that address for all subsequent communications with the host.
Configuration: The host selects a configuration for the device and sends a configuration request. The device responds by setting its configuration and sending a configuration descriptor, which contains information such as the device’s power requirements and the number of interfaces.
Read Interface Descriptor: The host selects an interface and sets up the endpoints. The device responds by setting up the interface and endpoints and sending an endpoint descriptor, which contains information such as the endpoint’s maximum packet size and transfer type.
Load Driver: A USB driver is a software component that allows a computer’s operating system to communicate with a USB device. Drivers act as a translator between the operating system and the device, allowing the operating system to recognize and control the device.
