
TA Session 12 x86-SSE text string processing instructions

tlongoria

Presentation Transcript


  1. Computer Architecture and System Programming Laboratory. TA Session 12: x86-SSE text string processing instructions

  2. X86-SSE Programming: Text Strings (SSE4.2). An implicit-length text string uses a terminating End-Of-String (EOS) character. X86-SSE includes four SIMD text string instructions (PCMPESTRI, PCMPESTRM, PCMPISTRI, PCMPISTRM) that can process text string fragments of up to 128 bits. Suppose you are given a text string fragment and want to create a mask that marks the positions of the uppercase characters within the string. For example, each 1 in the mask 1000110000010010b (written in string order, first character leftmost) signifies an uppercase character in the corresponding position of the text string "Ab1cDE23f4gHi5J6". The desired character range and the text string fragment are loaded into registers XMM1 and XMM2, respectively.

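  3. [Figure: PCMPISTRM example. The resulting mask, with bit 0 corresponding to the first character, is 0x4831 = 0100100000110001b; the figure also shows RFLAGS.]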

  4. • Output format: bit 6 of the immediate is set, which means that the mask value is expanded to bytes.
     • Multiple character ranges: XMM1 contains two range pairs, one for uppercase letters and one for lowercase letters.
     • Text string fragment that includes an embedded EOS ('\0') character: ZF is set to 1, and the final mask value excludes matching range characters following EOS.
     [Figures: RFLAGS for each of the three cases.]
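The range comparison described on the previous slides can be modeled in plain C. The sketch below is not the instruction itself but a scalar approximation of its "ranges" mode for unsigned bytes with an implicit-length (EOS-terminated) fragment; the function name `range_mask16` is hypothetical. On real hardware the same result would come from PCMPISTRM (exposed in C as the `_mm_cmpistrm` intrinsic in `<nmmintrin.h>`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model (an illustration, not the instruction) of the "ranges"
 * comparison on unsigned bytes.
 *   ranges: (low, high) byte pairs, at most 16 bytes, ended by '\0'
 *   frag:   text fragment, at most 16 bytes, ended by '\0' if shorter
 * Returns a 16-bit mask; bit i is 1 when frag[i] lies inside any range.
 * Characters at or after the EOS never set a bit. */
uint16_t range_mask16(const char *ranges, const char *frag)
{
    assert(ranges != NULL && frag != NULL);

    /* Implicit length of the range operand: stop at the first '\0'. */
    size_t rlen = 0;
    while (rlen < 16 && ranges[rlen] != '\0')
        rlen++;

    uint16_t mask = 0;
    for (size_t i = 0; i < 16 && frag[i] != '\0'; i++) {
        unsigned char c = (unsigned char)frag[i];
        /* An unpaired trailing range byte is ignored. */
        for (size_t j = 0; j + 1 < rlen; j += 2) {
            unsigned char lo = (unsigned char)ranges[j];
            unsigned char hi = (unsigned char)ranges[j + 1];
            if (c >= lo && c <= hi) {
                mask |= (uint16_t)(1u << i);
                break;
            }
        }
    }
    return mask;
}
```

With this model, `range_mask16("AZ", "Ab1cDE23f4gHi5J6")` yields 0x4831, the value shown on the slide. Passing the two range pairs `"AZaz"` marks every letter, and a fragment with an embedded EOS such as `"Ab1c\0EFG"` only reports matches before the EOS.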

  5. RFLAGS is set in a non-standard manner in order to supply the most relevant information:
     • CF flag: reset if IntRes2 is equal to zero, set otherwise
     • ZF flag: set if any byte/word of xmm2/mem128 is null, reset otherwise
     • SF flag: set if any byte/word of xmm1 is null, reset otherwise
     • OF flag: IntRes2[0]
     • AF flag: reset
     • PF flag: reset

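  6. • MOVDQU xmm1, xmm2/m128: move unaligned double quadword from xmm2/m128 to xmm1.
     • PADDB xmm1, xmm2/m128: add packed byte integers from xmm2/m128 and xmm1.
     • PAND xmm1, xmm2/m128: bitwise AND of xmm2/m128 and xmm1.

     extern printf

     section .data
     str:        db 'Ab1cDE23f4gHi5J6'
     AZ_mask:    db 'A', 'Z'              ; one range pair: 'A'..'Z'
                 times 14 db 0            ; pad the range operand to 16 bytes
     imm:        equ 01000100b            ; unsigned bytes, ranges, byte mask
     AZ2az_mask: times 16 db ('a' - 'A')  ; 0x20 in every byte
     result:     times 16 db 0
                 db `\n\0`                ; newline and terminator for printf

     section .text
     global main
     main:
         enter     0, 0
         movdqu    xmm1, [AZ_mask]        ; character range
         movdqu    xmm2, [str]            ; text string fragment
         pcmpistrm xmm1, xmm2, imm        ; xmm0 = 0xFF in every uppercase byte
         movdqu    xmm3, [AZ2az_mask]
         pand      xmm0, xmm3             ; 0x20 where uppercase, 0 elsewhere
         paddb     xmm2, xmm0             ; uppercase + 0x20 = lowercase
         movdqu    [result], xmm2
         mov       rdi, result            ; result doubles as the format string
         mov       rax, 0                 ; AL = 0: no vector register arguments
         call      printf
         leave
         ret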
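The data flow of the program on this slide (byte mask, then PAND, then PADDB) can be followed step by step in a scalar C sketch. This is an illustrative model under stated assumptions, not the author's code: the name `to_lower16` is hypothetical, and bytes at or after the EOS are written as zero so the model never reads past the end of a C string, whereas the assembly version adds zero to whatever bytes follow in the register.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Scalar model of: pcmpistrm (ranges 'A'..'Z', byte mask) -> pand -> paddb.
 * src: at most 16 characters, ended by '\0' if shorter.
 * dst: receives exactly 16 bytes (no terminator is appended). */
void to_lower16(const char *src, char dst[16])
{
    assert(src != NULL && dst != NULL);

    /* Step 1 (pcmpistrm, imm bit 6 set): 0xFF for each uppercase byte
     * before the EOS, 0x00 everywhere else. */
    unsigned char mask[16];
    memset(mask, 0, sizeof mask);
    size_t len = 0;
    for (; len < 16 && src[len] != '\0'; len++) {
        if (src[len] >= 'A' && src[len] <= 'Z')
            mask[len] = 0xFF;
    }

    for (size_t i = 0; i < 16; i++) {
        /* Assumption of this model: bytes from the EOS onward are zero. */
        unsigned char c = (i < len) ? (unsigned char)src[i] : 0;
        /* Step 2 (pand): keep 'a' - 'A' = 0x20 only where the mask is set.
         * Step 3 (paddb): add it, turning uppercase into lowercase. */
        dst[i] = (char)(c + (mask[i] & ('a' - 'A')));
    }
}
```

Calling `to_lower16("Ab1cDE23f4gHi5J6", out)` fills `out` with `ab1cde23f4ghi5j6`; digits and letters that are already lowercase pass through unchanged because their mask bytes are zero.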

  7. Equal any (imm[3:2] = 00). The result is a mask: 1 if the character belongs to the set, 0 if not. With imm[6] = 1, each mask bit is expanded to a full byte in xmm0.
     pcmpistrm xmm1, xmm2, 01000000b
     [Figure: contents of xmm1, xmm2 and the resulting xmm0.]
     Equal each (imm[3:2] = 10). The result is a mask: 1 if the corresponding bytes are equal, 0 if not equal.
     pcmpistrm xmm1, xmm2, 01001000b
     [Figure: contents of xmm1, xmm2 and the resulting xmm0.]

  8. Equal ordered (imm[3:2] = 11). The result is a mask: 1 if the substring is found at the corresponding position, 0 otherwise.
     pcmpistrm xmm1, xmm2, 01001100b
     [Figure: contents of xmm1, xmm2 and the resulting xmm0.]
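To make the three aggregation modes on the last two slides concrete, here is a scalar C sketch that computes the 16-bit result mask for each of them. It is a model written for illustration, with hypothetical function names, and it assumes implicit-length byte operands (a '\0' ends the valid part). The handling of positions past the EOS follows the instruction's documented rules as I understand them: "equal each" reports 1 where both operands have ended, and "equal ordered" accepts a partial match that runs off the end of the 16-byte fragment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of valid bytes: up to the first '\0', at most 16. */
static size_t valid_len16(const char *v)
{
    assert(v != NULL);
    size_t n = 0;
    while (n < 16 && v[n] != '\0')
        n++;
    return n;
}

/* Equal any: bit j is 1 if str[j] equals any character of set. */
uint16_t equal_any16(const char *set, const char *str)
{
    size_t ns = valid_len16(set), nt = valid_len16(str);
    uint16_t m = 0;
    for (size_t j = 0; j < nt; j++)
        for (size_t i = 0; i < ns; i++)
            if (str[j] == set[i]) { m |= (uint16_t)(1u << j); break; }
    return m;
}

/* Equal each: bit j is 1 if a[j] == b[j]; positions where both strings
 * have already ended count as equal, positions where only one has ended
 * count as different. */
uint16_t equal_each16(const char *a, const char *b)
{
    size_t na = valid_len16(a), nb = valid_len16(b);
    uint16_t m = 0;
    for (size_t j = 0; j < 16; j++) {
        int va = j < na, vb = j < nb;
        int bit = (va && vb) ? (a[j] == b[j]) : (!va && !vb);
        if (bit) m |= (uint16_t)(1u << j);
    }
    return m;
}

/* Equal ordered: bit j is 1 if needle occurs in hay starting at j.
 * Comparisons that would fall past byte 15 are skipped, so a partial
 * match at the very end of a full fragment still sets its bit. */
uint16_t equal_ordered16(const char *needle, const char *hay)
{
    size_t nn = valid_len16(needle), nh = valid_len16(hay);
    uint16_t m = 0;
    for (size_t j = 0; j < 16; j++) {
        int match = 1;
        for (size_t i = 0; i < nn && j + i < 16; i++) {
            if (j + i >= nh || hay[j + i] != needle[i]) { match = 0; break; }
        }
        if (match) m |= (uint16_t)(1u << j);
    }
    return m;
}
```

For example, `equal_any16("aeiou", "hello world")` sets bits 1, 4 and 7 (the vowels), and `equal_ordered16("ab", "xxabyyab")` sets bits 2 and 6. The partial-match rule is why a substring search over a long string has to re-check a hit reported in the last bytes of a fragment against the next fragment.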

  9. • IntRes1 calculation: mask according to the given range.
     • IntRes2 calculation: negative polarity applied to IntRes1.
     • RCX = index of the least significant set bit in IntRes2.
     • Here RCX = 16 (invalid index).
     [Figure: RCX and RFLAGS.]

  10. • IntRes1 calculation: mask according to the given range.
      • IntRes2 calculation: negative polarity applied to IntRes1.
      • RCX = index of the least significant set bit in IntRes2.
      • Here RCX = 11 (index of the '\0' character, or length of the string).
      [Figure: RCX and RFLAGS.]

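  12. section .data
      str:      db 'Ab1cDE23f4gHi5J6'
                db 'Ab1cDE23f4g', 0
      EOS_mask: db 0x1, 0xFF              ; range 0x01..0xFF: every non-EOS character
                times 14 db 0
      imm:      equ 00010100b             ; unsigned bytes, ranges, negative polarity, least significant index

      section .text
      global strlen
      strlen:
          enter     0, 0
          xor       rax, rax              ; rax = characters counted so far
          xor       rcx, rcx
          movdqu    xmm1, [EOS_mask]
      .loop:
          add       rax, rcx              ; skip the fragment just scanned
          pcmpistri xmm1, [str+rax], imm  ; rcx = index of EOS, or 16 if none
          jnz       .loop                 ; ZF = 0: no EOS in this fragment
          add       rax, rcx              ; add the index of EOS in the last fragment
          leave
          ret

      [Figures: RFLAGS and RCX after the first loop cycle and after the second loop cycle.]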
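The strlen loop on this slide can be traced with a scalar C model that mirrors the IntRes1 / IntRes2 / RCX steps from the earlier slides. This is a sketch for following the logic, not a replacement for the instruction; the names `eos_index16` and `sse_style_strlen` are hypothetical. One difference worth noting: the real PCMPISTRI always loads 16 bytes, while this model stops reading at the first '\0'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of pcmpistri with imm = 00010100b and the range 0x01..0xFF:
 * returns the index of the first EOS in a 16-byte fragment, or 16
 * (the "invalid index") when the fragment contains no EOS. */
static unsigned eos_index16(const char *chunk)
{
    assert(chunk != NULL);

    /* IntRes1: bit i set when chunk[i] is a valid character in the range
     * 0x01..0xFF. Every byte before the EOS satisfies the upper bound,
     * so only the lower bound is tested. */
    uint16_t intres1 = 0;
    for (unsigned i = 0; i < 16 && chunk[i] != '\0'; i++) {
        if ((unsigned char)chunk[i] >= 0x01)
            intres1 |= (uint16_t)(1u << i);
    }

    /* IntRes2: negative polarity inverts IntRes1, so the set bits now
     * mark the EOS and everything after it. */
    uint16_t intres2 = (uint16_t)~intres1;

    /* RCX: index of the least significant set bit, 16 if none. */
    for (unsigned i = 0; i < 16; i++) {
        if (intres2 & (1u << i))
            return i;
    }
    return 16;
}

/* Same control flow as the assembly loop: advance 16 characters at a
 * time until a fragment contains the EOS, then add its index. */
size_t sse_style_strlen(const char *s)
{
    size_t len = 0;
    for (;;) {
        unsigned idx = eos_index16(s + len);
        len += idx;
        if (idx < 16)   /* corresponds to ZF = 1: EOS found */
            return len;
    }
}
```

For the 27-character string used on the slide, the first call returns 16 (no EOS in the first fragment) and the second returns 11, giving 16 + 11 = 27.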
