x86 Assembler using MASM32 Tutorial 12 - String Concatenation, Joining Two Strings






In this tutorial we will take two separate strings in memory and add them together to form a third string that contains both of the original strings. First, open Visual MASM; if you haven't installed Visual MASM yet, you can learn how to do it here. Once it is installed, open it and create the default code template needed for all my tutorials, as described here. This default code will be the starting point for most of our tutorials, including this one. Once all this is done, you should be looking at the code window with the default template loaded.



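Scroll down to the .data section and add the following three strings and two length constants. Each Equ line must come directly after its string, because $ is the current position in the data section and $ minus the string's address gives the string's length in bytes, including the zero terminator: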

strString1 DB "Assembler ", 0
len1 Equ $ - strString1
strString2 DB "is cool like", 0
len2 Equ $ - strString2
strMessage DB "MessageBox Title", 0


Then scroll down to the uninitialised data section which starts with .data? and add the following code:

strFinalString DB len1 + len2 - 1 DUP(?)
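The buffer size can be checked by hand. The fragment below is a minimal sketch that spells out the arithmetic in comments and gives the size a name; FINAL_LEN is a hypothetical constant that is not part of the original code, and the DB line is meant as an alternative to the line above, not an addition to it.

```asm
; "Assembler "   = 10 characters + 1 zero terminator -> len1 = 11
; "is cool like" = 12 characters + 1 zero terminator -> len2 = 13
; combined text  = 10 + 12 characters + 1 zero terminator = 23 bytes
;                = len1 + len2 - 1   (11 + 13 - 1 = 23)

FINAL_LEN Equ len1 + len2 - 1        ; hypothetical name, evaluates to 23
strFinalString DB FINAL_LEN DUP(?)   ; reserves the same 23 bytes
```

The minus one is there because only one of the two zero terminators is kept in the combined string.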

In the .code section, after start:, add the following code:

; set Esi to memory address of strString1
; set Ecx (for Rep) to strString1 length not including 0 string terminator
; set Edi to address of strFinalString
Mov Esi, Offset strString1
Mov Ecx, len1 - 1
Mov Edi, Offset strFinalString

; Repeat: copy bytes from strString1 to strFinalString until Ecx is zero; Ecx is decremented after each byte is copied
Rep Movsb

; set Esi to memory address of strString2
; set Ecx (for Rep) to strString2 length including the 0 string terminator
; set Edi to address in strFinalString of end of strString1
Mov Esi, Offset strString2
Mov Ecx, len2
Mov Edi, Offset strFinalString + len1 - 1

; Repeat: copy bytes from strString2 to strFinalString until Ecx is zero; Ecx is decremented after each byte is copied
Rep Movsb

; MessageBox First String
Push MB_OK
Push Offset strMessage
Push Offset strString1
Push 0
Call MessageBox

; MessageBox Second String
Push MB_OK
Push Offset strMessage
Push Offset strString2
Push 0
Call MessageBox

; MessageBox Combined String
Push MB_OK
Push Offset strMessage
Push Offset strFinalString
Push 0
Call MessageBox


The Rep Movsb instruction copies bytes from one location in memory to another, one byte at a time. Movsb copies the byte at the address in Esi to the address in Edi and then advances both registers by one (assuming the direction flag is clear, which is normally the case when a Windows program starts). The Rep prefix repeats this, decrementing Ecx after each byte, until Ecx reaches zero. It does not look for the zero terminator, so the count in Ecx alone decides how many bytes are copied. The memory address of the first string is loaded into the Esi register, and the length of the first string minus the zero termination byte is loaded into the Ecx register. This is the number of bytes to copy from the string at Esi; we do not want to copy the ending zero, because the second string has to follow directly after the first. The location in memory where the bytes will be copied to is the strFinalString variable, so the address of its first byte is loaded into the Edi register. We then use Rep Movsb to copy strString1 into strFinalString.
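To make the behaviour of Rep Movsb more concrete, here is a rough sketch of the first copy written as an explicit loop. It is only an illustration, not a replacement for the tutorial code: the CopyLoop label is a made-up name, and unlike Rep Movsb this version overwrites Al and changes the flags. It also assumes Ecx starts above zero, which holds here because len1 - 1 is 10.

```asm
    Mov Esi, Offset strString1      ; source address
    Mov Edi, Offset strFinalString  ; destination address
    Mov Ecx, len1 - 1               ; number of bytes to copy (no terminator)

CopyLoop:                           ; hypothetical label
    Mov Al, [Esi]                   ; read one byte from the source
    Mov [Edi], Al                   ; write it to the destination
    Inc Esi                         ; advance the source pointer
    Inc Edi                         ; advance the destination pointer
    Dec Ecx                         ; one byte fewer left to copy
    Jnz CopyLoop                    ; keep going until Ecx reaches zero
```

Rep Movsb does the same work in a single instruction, which is why the tutorial uses it.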

The same process is then repeated for strString2, except that the memory address loaded into the Edi register is the address of the byte immediately after the last byte written by the previous copy, so that the two strings are sequential in memory. This time we include the zero termination byte in the Ecx count so that the combined string is properly terminated. Rep Movsb then runs and copies the bytes from strString2 into strFinalString.
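Because Rep Movsb leaves Edi pointing just past the last byte it wrote, the second load of Edi is not strictly required. The sketch below shows that shorter variant; it is an alternative arrangement rather than the tutorial's own code, and the Cld at the top is an added precaution that makes sure the copy runs forwards.

```asm
    Cld                             ; clear the direction flag: copy forwards
    Mov Esi, Offset strString1
    Mov Edi, Offset strFinalString
    Mov Ecx, len1 - 1               ; text of strString1 without its terminator
    Rep Movsb                       ; Edi now points just past the copied text

    Mov Esi, Offset strString2
    Mov Ecx, len2                   ; text of strString2 plus its terminator
    Rep Movsb                       ; continues writing where the first copy stopped
```

Setting Edi explicitly, as the tutorial does, is slightly longer but makes the destination of the second copy obvious to the reader.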

Three MessageBoxes are then displayed: one showing the first string, one showing the second string and one showing the combined final string. Congratulations, you have just concatenated two strings in x86 assembler.
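The order of the Push instructions may look backwards: the last argument (MB_OK) is pushed first and the window handle (0) is pushed last. As a hedged illustration, the same call can be written with MASM's Invoke directive, which lists the arguments left to right and generates the pushes in reverse order for you. This assumes the MessageBox prototype from user32.inc is available, as it is in the default template.

```asm
    ; Equivalent to the four Push instructions followed by Call MessageBox.
    ; Arguments in order: window handle, text, caption, message box type.
    Invoke MessageBox, 0, Offset strFinalString, Offset strMessage, MB_OK
```

The tutorial keeps the explicit Push and Call form, which shows exactly what ends up on the stack.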

The entire code looks like this:

; *************************************************************************
; 32-bit Windows Program
; *************************************************************************

.686                                      ; Enable 80686+ instruction set
.model flat, stdcall                ; Flat, 32-bit memory model (not used in 64-bit)
option casemap: none         ; Case sensitive syntax

; *************************************************************************
; MASM32 proto types for Win32 functions and structures
; *************************************************************************
include c:\masm32\include\windows.inc
include c:\masm32\include\user32.inc
include c:\masm32\include\kernel32.inc
include c:\masm32\include\masm32rt.inc     ; for using ustr$() and such like

; *************************************************************************
; MASM32 object libraries
; *************************************************************************
includelib c:\masm32\lib\user32.lib
includelib c:\masm32\lib\kernel32.lib

; *************************************************************************
; Our data section.
; *************************************************************************
.data

strString1 DB "Assembler ", 0
len1 Equ $ - strString1
strString2 DB "is cool like", 0
len2 Equ $ - strString2
strMessage DB "MessageBox Title", 0

; *************************************************************************
; Our uninitialised data section.
; *************************************************************************
.data?

strFinalString DB len1 + len2 - 1 DUP(?)

; *************************************************************************
; Our constant section.
; *************************************************************************
.const



; *************************************************************************
; Macros
; *************************************************************************



; *************************************************************************
; Our executable assembly code starts here in the .code section
; *************************************************************************
.code

start:

    ; set Esi to memory address of strString1
    ; set Ecx (for Rep) to strString1 length not including 0 string terminator
    ; set Edi to address of strFinalString
    Mov Esi, Offset strString1
    Mov Ecx, len1 - 1
    Mov Edi, Offset strFinalString

    ; Repeat: copy bytes from strString1 to strFinalString until Ecx is zero; Ecx is decremented after each byte is copied
    Rep Movsb

    ; set Esi to memory address of strString2
    ; set Ecx (for Rep) to strString2 length including the 0 string terminator
    ; set Edi to address in strFinalString of end of strString1
    Mov Esi, Offset strString2
    Mov Ecx, len2
    Mov Edi, Offset strFinalString + len1 - 1

    ; Repeat: copy bytes from strString2 to strFinalString until Ecx is zero; Ecx is decremented after each byte is copied
    Rep Movsb

    ; MessageBox First String
    Push MB_OK
    Push Offset strMessage
    Push Offset strString1
    Push 0
    Call MessageBox

    ; MessageBox Second String
    Push MB_OK
    Push Offset strMessage
    Push Offset strString2
    Push 0
    Call MessageBox

    ; MessageBox Combined String
    Push MB_OK
    Push Offset strMessage
    Push Offset strFinalString
    Push 0
    Call MessageBox

    Push 0
    Call ExitProcess

end start


When you run the program you should see the three MessageBoxes that display the three strings.

Thank you for following this tutorial; I hope you found it useful. In the next tutorial we will have a look at creating a simple window in x86 assembler, so until then, enjoy.


Link to a text file with complete source code and more comments - here.