For embedded developers and hardware hackers, JTAG is the de facto standard for debugging and accessing microprocessor registers. This protocol has been in use for many years and is still in use today. Its main drawback is that it uses a lot of signals to work (at least 4 – TCK, TMS, TDI, TDO). This has become a problem now that devices have gotten smaller and smaller and low pin count microcontrollers are available.
To address this, ARM created an alternative debug interface called SWD (Serial Wire Debug) that only uses two signals (SWDCLK and SWDIO). This interface and its associated protocol are now available in nearly all Cortex-[A,R,M] processors.
ARM Debug Interface
Contrary to JTAG, which chains TAPs together, SWD uses a bus called DAP (Debug Access Port). On this DAP, there is one master (the DP – Debug Port) and one or more slaves (AP – Access Ports), similar to JTAG TAPs. The DP communicates with the APs using packets that contain the AP address.
To sum this up, an external debugger connects to the DAP via the DP using a protocol called SWD. This whitepaper from ARM shows a nice overview of the SWD architecture :
The Debug Port is the interface between the host and the DAP. It also handles the host interface. There are three different Debug Ports available to access the DAP :
- JTAG Debug Port (JTAG-DP). This port uses the standard JTAG interface and protocol to access the DAP
- Serial Wire Debug Port (SW-DP). This port uses the SWD protocol to access the DAP.
- Serial Wire / JTAG Debug Port (SWJ-DP). This port can use either JTAG or SWD to access the DAP. This is a common interface found on many microcontrollers. It reuses the TMS and TCK JTAG signals to transfer the SWDIO and SWDCLK signals respectively. A specific sequence has to be sent in order to switch from one interface to the other.
Multiple APs can be added to the DAP, depending on the needs. ARM provides specifications for two APs :
- Memory Access Port (MEM-AP). This AP provides access to the core memory and registers.
- JTAG Access Port (JTAG-AP). This AP allows a JTAG chain to be connected to the DAP.
As said earlier, SWD uses only two signals :
- SWDCLK. The clock signal sent by the host. As there is no relation between the processor clock and the SWD clock, the frequency selection is up to the host interface. According to this KB article, the maximum debug clock frequency is about 60MHz, but it varies in practice.
- SWDIO. This is the bidirectional signal carrying the data from/to the DP. The data is set by the host during the rising edge and sampled by the DP during the falling edge of the SWDCLK signal.
Both lines should be pulled up on the target.
Each SWD transaction has three phases :
- Request phase. 8 bits sent from the host.
- ACK phase. 3 bits sent from the target.
- Data phase. Up to 32 bits sent from/to the host, followed by a parity bit (set to 1 when the data contains an odd number of 1 bits).
Note that a turnaround (Trn) cycle has to be inserted every time the data direction changes.
The request header contains the following fields :

| Field | Description |
|-------|-------------|
| Start | Start bit. Should be 1 |
| APnDP | Access to DP (0) or AP (1) |
| RnW | Write (0) or Read (1) request |
| A[2:3] | AP or DP register address bits [2:3] |
| Parity | Parity over (APnDP, RnW, A[2:3]): 1 when an odd number of these bits are set |
| Stop | Stop bit. Should be 0 |
| Park | Park bit sent before changing SWDIO to open-drain. Should be 1 |

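To make the header layout concrete, here is a minimal sketch that assembles the request byte from the fields above. The function name `swd_request` is mine and is not part of any library. It assumes the byte is shifted out LSB first, so the Start bit ends up in bit 0.

```python
def swd_request(apndp: int, rnw: int, addr: int) -> int:
    """Build the 8-bit SWD request header as a byte to be sent LSB first.

    apndp: 0 for a DP access, 1 for an AP access
    rnw:   0 for a write, 1 for a read
    addr:  register address (0x0, 0x4, 0x8 or 0xc); only bits [3:2] are sent
    """
    a2 = (addr >> 2) & 1
    a3 = (addr >> 3) & 1
    # The parity bit is 1 when an odd number of the four covered bits are set
    parity = (apndp + rnw + a2 + a3) & 1
    # Wire order: Start, APnDP, RnW, A2, A3, Parity, Stop, Park.
    # With LSB-first transmission, the first bit on the wire is bit 0.
    wire_bits = [1, apndp, rnw, a2, a3, parity, 0, 1]
    return sum(bit << position for position, bit in enumerate(wire_bits))


print(hex(swd_request(apndp=0, rnw=1, addr=0x0)))  # DP read at address 0x0, prints 0xa5
```

The same helper gives 0xb1 for a DP write at address 0x8, which is the SELECT request byte used later in this post.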
The ACK bits contain the ACK status of the request header. Note that the three bits must be read LSB first.

| Value (read LSB first) | Description |
|------------------------|-------------|
| 0b001 (1) | OK response. Operation was successful |
| 0b010 (2) | WAIT response. Host must retry the request. |
| 0b100 (4) | FAULT response. An error has occurred |

The data is sent either by the host or the target. It is sent LSB first, and ends with a parity bit, which is 1 when the 32 data bits contain an odd number of 1 bits.
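The ACK and data phases can be sketched in a few lines of Python. This is only an illustration with helper names of my own choosing. The ACK values assume the encoding from the ADIv5 specification as I read it (OK = 0b001, WAIT = 0b010, FAULT = 0b100 once the bits are accumulated LSB first), which matches the `Status: 0x1` printed by the script later in this post.

```python
ACK_NAMES = {0b001: "OK", 0b010: "WAIT", 0b100: "FAULT"}


def decode_ack(bits):
    """bits: the three ACK bits, in the order they were received."""
    value = sum(bit << i for i, bit in enumerate(bits))  # first bit is the LSB
    return ACK_NAMES.get(value, "protocol error")


def data_parity(word: int) -> int:
    """Parity bit of a 32-bit data word: 1 if the number of 1 bits is odd."""
    return bin(word & 0xFFFFFFFF).count("1") & 1


def check_read(raw: bytes, parity_bit: int) -> int:
    """Rebuild a word from 4 bytes received LSB first and verify its parity."""
    word = int.from_bytes(raw, byteorder="little")
    if data_parity(word) != parity_bit:
        raise ValueError("parity mismatch")
    return word


print(decode_ack([1, 0, 0]))
print(hex(check_read(b"\x77\x14\xa0\x2b", 0)))
```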
Now that we know more about the low-level part of the protocol, it’s time to interact with an actual target. In order to do so, I used a Hydrabus, but this can also be done using a Bus Pirate or any other similar tool. During this experiment, I used an STM32F103 development board, nicknamed Blue Pill. It is easily available and already has an SWD connector.
The ARM Debug Interface Architecture Specification document contains all the details needed to interact with the SWD interface, so let’s get started.
As the target uses an SWJ-DP interface, it needs to be switched from the default JTAG mode to SWD. The chapter 5.2.1 of the document shows the sequence to be sent to switch from JTAG to SWD :
1. Send at least 50 SWCLKTCK cycles with SWDIOTMS HIGH. This ensures that the current interface is in its reset state. The JTAG interface only detects the 16-bit JTAG-to-SWD sequence starting from the Test-Logic-Reset state.
2. Send the 16-bit JTAG-to-SWD select sequence on SWDIOTMS.
3. Send at least 50 SWCLKTCK cycles with SWDIOTMS HIGH. This ensures that if SWJ-DP was already in SWD operation before sending the select sequence, the SWD interface enters line reset state.
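The sequence is 0b0111 1001 1110 0111 (0x79e7) MSB first. Packed byte by byte for an LSB-first transmitter, this would be 0x9e 0xe7. The script below sends 0x7b 0x9e instead, which produces the same 16-bit pattern two bit positions later once the 0xff bytes that follow are taken into account (the two leading 1 bits of 0x7b simply extend the preceding run of high cycles from 48 to 50).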

    import pyHydrabus

    r = pyHydrabus.RawWire('/dev/ttyACM0')
    r._config = 0xa  # Set GPIO open-drain / LSB first
    r._configure_port()
    # Line reset (SWDIO high), JTAG-to-SWD select sequence, line reset again
    r.write(b'\xff\xff\xff\xff\xff\xff\x7b\x9e\xff\xff\xff\xff\xff\xff')

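Bit-order conversions are easy to get wrong, so here is a small, hardware-free sketch to check the bytes used above. The helper names are mine. It packs the select sequence for an LSB-first transmitter, then looks for the 16-bit pattern in the exact byte string sent by the script.

```python
def msb_bits(value: int, width: int) -> list:
    """Bits of value, most significant first (the order used in the spec)."""
    return [(value >> (width - 1 - i)) & 1 for i in range(width)]


def pack_lsb_first(bits: list) -> bytes:
    """Pack a wire-order bit list into bytes for an LSB-first transmitter."""
    out = bytearray()
    for start in range(0, len(bits), 8):
        chunk = bits[start:start + 8]
        out.append(sum(bit << i for i, bit in enumerate(chunk)))
    return bytes(out)


def unpack_lsb_first(data: bytes) -> list:
    """Wire-order bit list produced when data is sent LSB first."""
    return [(byte >> i) & 1 for byte in data for i in range(8)]


def find_bits(stream: list, pattern: list) -> int:
    """Index of the first occurrence of pattern in stream, or -1."""
    for offset in range(len(stream) - len(pattern) + 1):
        if stream[offset:offset + len(pattern)] == pattern:
            return offset
    return -1


select_bits = msb_bits(0x79E7, 16)
direct = pack_lsb_first(select_bits)

# The exact bytes written by the script above
stream = unpack_lsb_first(b"\xff" * 6 + b"\x7b\x9e" + b"\xff" * 6)
offset = find_bits(stream, select_bits)

print(direct.hex(), offset)
```

The direct packing gives 0x9e 0xe7. The bytes used by the script contain the same pattern starting at bit 50 of the stream, that is, after 50 high cycles.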
Now that the DP is in reset state, we can issue a DPIDR read command to identify the Debug Port. To do so, we need to read the DP register at address 0x00.

| Start | APnDP | RnW | A[2:3] | Parity | Stop | Park |
|-------|-------|-----|--------|--------|------|------|
| 1 | 0 | 1 | 0 0 | 1 | 0 | 1 |

Sent LSB first, these bits form the byte 0xa5.
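
    # Four more high bits, idle (low) cycles, then the DPIDR read request (0xa5)
    r.write(b'\x0f\x00\xa5')
    status = 0
    for i in range(3):
        status += ord(r.read_bit()) << i  # ACK bits, LSB first
    print("Status: ", hex(status))
    print("DPIDR", hex(int.from_bytes(r.read(4), byteorder="little")))
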
The next step is to power up the debug domain. Chapter 2.4.5 tells us that this is requested through the CTRL/STAT register (address 0x4) of the DP, where CDBGPWRUPREQ (bit 28) requests the power-up and CDBGPWRUPACK (bit 29) acknowledges it :
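
    # Write request to the DP.
    # Note: per the header table above, 0x81 carries A[2:3] = 0b00 (DP address 0x0);
    # a write request to address 0x4 would be encoded as 0xa9.
    r.write(b'\x81')
    for _ in range(5):
        r.read_bit()  # Do not take care about the response
    # Data phase: 0x1e sent LSB first (0x78 when written MSB first), followed by zeros
    r.write(b'\x1e\x00\x00\x00\x00')
    # Send some clock cycles to sync up the line
    r.write(b'\x00')
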
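As a cross-check of the byte values, here is a sketch of how a complete DP write can be derived from the header table and the parity rule. The helper `dp_write_bytes` is hypothetical. The value 0x50000000 (CDBGPWRUPREQ and CSYSPWRUPREQ, bits 28 and 30 in the ADIv5 CTRL/STAT layout) is my assumption of a typical power-up request. It is not a value taken from the script above.

```python
def dp_write_bytes(addr: int, value: int):
    """Return (request_byte, payload) for a DP register write, sent LSB first.

    payload holds the 32 data bits followed by one byte whose bit 0 is the
    parity bit (the remaining 7 bits are idle low cycles).
    """
    a2 = (addr >> 2) & 1
    a3 = (addr >> 3) & 1
    header_parity = (a2 + a3) & 1  # APnDP = 0 and RnW = 0 add nothing
    request = 1 | (a2 << 3) | (a3 << 4) | (header_parity << 5) | (1 << 7)
    value &= 0xFFFFFFFF
    data_parity = bin(value).count("1") & 1
    payload = value.to_bytes(4, byteorder="little") + bytes([data_parity])
    return request, payload


# Assumed power-up request: CDBGPWRUPREQ (bit 28) | CSYSPWRUPREQ (bit 30)
request, payload = dp_write_bytes(0x4, (1 << 28) | (1 << 30))
print(hex(request), payload.hex())
```

Between the request byte and the payload, the host still has to clock the turnaround and ACK bits, as the script does with its five `read_bit()` calls.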
Now that the debug power domain is up, the DAP is fully accessible. As a first discovery process, we will query an AP, then scan for all APs in the DAP.
Reading from an AP
Reading from an AP is always done via the DP. To query an AP, the host must tell the DP which AP to address on the DAP. AP reads are posted: the data returned by a read request belongs to the previous transaction, and the DP also exposes the last result in a special register called RDBUFF (address 0xc). This means that the query method is the following :
- Write to the DP SELECT register, setting the APSEL and APBANKSEL fields.
- Send a first read request for the AP register to “commit” the transaction. The data returned at this point is stale and is discarded.
- Send the read request again (or read the DP RDBUFF register) to get the actual value.
The SELECT register is described in chapter 2.3.9. The interesting fields are noted here :

| Field | Bits | Description |
|-------|------|-------------|
| APSEL | [31:24] | Selects the AP address. There are up to 256 APs on the DAP. |
| APBANKSEL | [7:4] | Selects the AP register bank to query. In our case, we will query the IDR register to identify the AP. |

One interesting AP register to read is the IDR register (address 0xfc, that is bank 0xf, offset 0xc), which contains the identification information for this AP. The code below sums up the procedure to read the IDR of the AP at address 0x0.

    ap = 0  # AP address

    r.write(b'\xb1')  # Write to DP SELECT register
    for _ in range(5):
        r.read_bit()  # Don't read the status bits
    r.write(b'\xf0\x00\x00')  # Fill APBANKSEL with 0xf
    r.write(ap.to_bytes(1, byteorder="little"))  # Fill APSEL with AP address
    # This calculates the parity bit to be sent after the data phase.
    # 0xf0 has an even number of 1 bits, so only the AP address matters.
    if (bin(ap).count('1') % 2) == 0:
        r.write(b'\x00')
    else:
        r.write(b'\x01')

    r.write(b'\x9f')  # AP read request with A[2:3] = 0b11 (IDR in bank 0xf)
    status = 0
    for i in range(3):
        status += ord(r.read_bit()) << i  # Read transaction status
    print("Status: ", hex(status))
    # Dummy read: the data returned here belongs to the previous transaction
    # print("dummy", hex(int.from_bytes(r.read(4), byteorder="little")))
    r.read(4)
    r.write(b'\x00')

    r.write(b'\x9f')  # Same read request, this time for real
    status = 0
    for i in range(3):
        status += ord(r.read_bit()) << i
    print("Status: ", hex(status))
    idcode = hex(int.from_bytes(r.read(4), byteorder="little"))  # Read actual value
    if idcode != '0x0':  # If no AP present, value will be 0
        print("AP", hex(ap), idcode)
    r.write(b'\x00')

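The SELECT bytes written above can be derived from the AP number and the register address. The sketch below uses a helper name of my own and assumes the ADIv5 convention that the IDR sits at AP register address 0xFC, so bits [7:4] of the address go into APBANKSEL and bits [3:2] travel in the request header.

```python
def ap_register_select(ap: int, reg: int):
    """Split an AP register access into the SELECT word and the header offset.

    Returns (select_word, offset), where offset is the value (0x0, 0x4, 0x8
    or 0xc) whose bits [3:2] are sent as A[2:3] in the request header.
    """
    bank = (reg >> 4) & 0xF   # goes into SELECT.APBANKSEL, bits [7:4]
    offset = reg & 0xC        # register within the selected bank
    select_word = ((ap & 0xFF) << 24) | (bank << 4)
    return select_word, offset


select_word, offset = ap_register_select(ap=0, reg=0xFC)
payload = select_word.to_bytes(4, byteorder="little")  # sent LSB first
print(hex(select_word), hex(offset), payload.hex())
```

For AP 0 this gives the bytes f0 00 00 00, which matches the `b'\xf0\x00\x00'` plus AP address byte sent by the script.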
Scanning for APs
With the exact same code, we can iterate on the whole address space and see if there are any other APs on the DAP :

    for ap in range(0x100):
        r.write(b'\x00')
        r.write(b'\xb1')
        for _ in range(5):
            r.read_bit()
        #r.write(b'\xf0\x00\x00\x00\x00')
        r.write(b'\xf0\x00\x00')
        r.write(ap.to_bytes(1, byteorder="little"))
        if (bin(ap).count('1') % 2) == 0:
            r.write(b'\x00')
        else:
            r.write(b'\x01')
        r.write(b'\x9f')
        status = 0
        for i in range(3):
            status += ord(r.read_bit()) << i
        #print("Status: ",hex(status))
        #print("dummy", hex(int.from_bytes(r.read(4), byteorder="little")))
        r.read(4)
        r.write(b'\x00')
        r.write(b'\x9f')
        status = 0
        for i in range(3):
            status += ord(r.read_bit()) << i
        #print("Status: ",hex(status))
        idcode = hex(int.from_bytes(r.read(4), byteorder="little"))
        if idcode != '0x0':
            print("AP", hex(ap), idcode)

Running the script shows that there is only one AP on the bus. According to the documentation, it is the MEM-AP :

    > python3 /tmp/swd.py
    Status: 0x1
    DPIDR 0x2ba01477
    AP 0x0 0x24770011

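The two identification values can be split into fields to confirm what they describe. The sketch below follows the DPIDR and AP IDR layouts as I read them in the ADIv5 specification, so treat the field positions as an assumption to check against the document rather than a reference decoder.

```python
def decode_dpidr(value: int) -> dict:
    """Split a DPIDR value into its main fields (assumed ADIv5 layout)."""
    return {
        "revision": (value >> 28) & 0xF,
        "partno": (value >> 20) & 0xFF,
        "version": (value >> 12) & 0xF,
        "designer": (value >> 1) & 0x7FF,  # JEP106 code
    }


def decode_ap_idr(value: int) -> dict:
    """Split an AP IDR value into its main fields (assumed ADIv5 layout)."""
    return {
        "revision": (value >> 28) & 0xF,
        "designer": (value >> 17) & 0x7FF,  # JEP106 code
        "class": (value >> 13) & 0xF,       # 0x8 is a MEM-AP
        "variant": (value >> 4) & 0xF,
        "type": value & 0xF,                # 0x1 is an AHB bus
    }


dpidr = decode_dpidr(0x2BA01477)
ap_idr = decode_ap_idr(0x24770011)
print({name: hex(field) for name, field in dpidr.items()})
print({name: hex(field) for name, field in ap_idr.items()})
```

Both values carry the designer code 0x23b, which is ARM's JEP106 code, and the AP IDR decodes to class 0x8 with type 0x1, which is consistent with a MEM-AP sitting on an AHB bus.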
From here, it is possible to send commands to the MEM-AP to query the processor memory.
Discovering SWD pins
On real devices, it is not always easy to determine which pins or testpoints are used for the debug interface. The same is true for JTAG, which is why tools like the JTAGulator exist. Its purpose is to discover JTAG interfaces by trying every pin combination until one of them returns a valid IDCODE.
Now that we have a better understanding of how an SWD interface is initialized, we can apply the same approach to SWD interfaces. The idea is the following :
1. Take a number of interesting pins on a target board
2. Wire them up on the SWD discovery device
3. Select two pins on the SWD discovery device as SWDCLK and SWDIO
4. Send the SWD initialization sequence.
5. Read the status response and the DPIDR register
6. If valid results, print the solution
7. If no valid results, go to step 3 and select two new pins
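The search loop itself is short. Here is a minimal sketch of it, not the Hydrabus firmware implementation. The `probe` callable is hypothetical: it stands for "configure these two pins as SWDCLK and SWDIO, send the initialization sequence and return the DPIDR, or None if the ACK was not OK". Rejecting all-zeros and all-ones values is my own assumption of what a valid result looks like.

```python
from itertools import permutations


def find_swd(pins, probe):
    """Try every ordered (clk, io) pin pair until probe returns a plausible DPIDR.

    pins:  iterable of pin names wired to the discovery device
    probe: hypothetical callable (clk, io) -> DPIDR value, or None on failure
    """
    for clk, io in permutations(pins, 2):
        idcode = probe(clk, io)
        # Assumption: a floating or stuck line reads as all zeros or all ones
        if idcode not in (None, 0x00000000, 0xFFFFFFFF):
            return clk, io, idcode
    return None


# Simulated target for illustration: only PB5/PB6 answer with a DPIDR
def fake_probe(clk, io):
    return 0x2BA01477 if (clk, io) == ("PB5", "PB6") else None


pins = ["PB%d" % n for n in range(8)]
print(find_swd(pins, fake_probe))
```

The order matters, since swapping clock and data does not work. With 8 pins, this means at most 8 × 7 = 56 attempts.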
This method has been implemented for the Hydrabus firmware, and so far brings positive results. An example session is displayed here :

    > 2-wire
    Device: twowire1
    GPIO resistor: floating
    Frequency: 1000000Hz
    Bit order: MSB first
    twowire1> brute 8
    Bruteforce on 8 pins.
    Device found. IDCODE : 2BA01477
    CLK: PB5
    IO: PB6
    twowire1>

The operation takes less than two seconds, and has reliably discovered SWD interfaces on all the boards tested so far.
In this post we showed how the ARM debug interface is designed, and how the SWD protocol works at a very low level. With this information, it is possible to send queries to the MEM-AP using a simple microcontroller. That part goes far beyond the purpose of this post and will not be covered here. The PySWD library is a helpful resource to start interacting with the MEM-AP.
We also showed how to implement an SWD detection tool to help find SWD ports, similar to existing tools used for JTAG detection.