stristr, case-insensitive string search
Download source code and demo - 27 Kb
Introduction
I can't count how many times I have grumbled about the lack of a case-insensitive implementation of strstr. A nice benefit is that the function can not only be used as is, it can also be added to an MFC CString-derived class, P/Invoked from C# code, and so on. Whenever I had to perform a case-insensitive lookup in the past, I would resort to one of two rather horrible workarounds (and that's a euphemism), only because no such function or method existed.
All of this is OVER. What's odd is that not only does the C run-time not provide such a function, there is no equivalent in the MFC CString class, in the STL basic_string, or in the C# String class either. Simply put, amazing!
x86 assembly source code
Here we go with the code for this implementation:

    #pragma warning(disable : 4035) // C4035: no return value (the result is left in eax)

    // stristr /////////////////////////////////////////////////////////
    //
    // performs a case-insensitive lookup of a string within another
    // (see C run-time strstr)
    //
    // str1 : buffer
    // str2 : string to search for in the buffer
    //
    // example char* s = stristr("Make my day","DAY");
    //
    // S.Rodriguez, Jan 11, 2004
    //
    char* stristr(char* str1, char* str2)
    {
        __asm
        {
            mov ah, 'A'
            mov dh, 'Z'

            mov esi, str1
            mov ecx, str2
            mov dl, [ecx]
            test dl,dl ; NUL terminator, i.e. empty str2?
            jz short str2empty_label

    outerloop_label:
            mov ebx, esi ; save esi
            inc ebx
    innerloop_label:
            mov al, [esi]
            inc esi
            test al,al
            je short str2notfound_label ; end of str1, not found!

            cmp dl,ah ; 'A'
            jb short skip1
            cmp dl,dh ; 'Z'
            ja short skip1
            add dl,'a' - 'A' ; make lowercase the current character in str2
    skip1:

            cmp al,ah ; 'A'
            jb short skip2
            cmp al,dh ; 'Z'
            ja short skip2
            add al,'a' - 'A' ; make lowercase the current character in str1
    skip2:

            cmp al,dl
            je short onecharfound_label
            mov esi, ebx ; restore esi value, +1
            mov ecx, str2 ; restore ecx value as well
            mov dl, [ecx]
            jmp short outerloop_label ; search from start of str2 again
    onecharfound_label:
            inc ecx
            mov dl,[ecx]
            test dl,dl
            jnz short innerloop_label
            jmp short str2found_label ; found!

    str2empty_label:
            mov eax, esi // empty str2 ==> return str1
            jmp short ret_label

    str2found_label:
            dec ebx
            mov eax, ebx // str2 found ==> return occurrence within str1
            jmp short ret_label

    str2notfound_label:
            xor eax, eax // str2 not found ==> return NULL
            jmp short ret_label

    ret_label:
        }
    }

    #pragma warning(default : 4035)
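For readers who do not read x86 assembly fluently, the control flow above can be sketched in portable C. This is only an illustration, not part of the downloadable package: the names `stristr_c` and `fold_ascii` are hypothetical, and the sketch assumes the same ASCII-only folding of 'A' to 'Z' that the assembly performs (no locale handling).

```c
#include <stddef.h>

/* Fold an ASCII uppercase letter to lowercase; every other byte passes
   through unchanged. Same 'A'..'Z' range check as the assembly. */
static unsigned char fold_ascii(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

/* Hypothetical name: a C rendering of the same naive search. */
char *stristr_c(const char *str1, const char *str2)
{
    const char *start;

    if (*str2 == '\0')
        return (char *)str1;                /* empty str2 ==> return str1 */

    for (start = str1; *start != '\0'; ++start) {
        const char *s = start;              /* cursor in the buffer (esi) */
        const char *p = str2;               /* cursor in the pattern (ecx) */

        while (*s != '\0' &&
               fold_ascii((unsigned char)*s) == fold_ascii((unsigned char)*p)) {
            ++s;
            ++p;
            if (*p == '\0')
                return (char *)start;       /* whole pattern matched */
        }
        if (*s == '\0')
            return NULL;                    /* buffer ended before the pattern did */
    }
    return NULL;                            /* str2 not found */
}
```

As in the assembly, a mismatch restarts the comparison one character after the previous starting point, so the worst case is proportional to the product of the two string lengths.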
Using it in your own source code
The demo code provided in the zip package contains a benchmark suite that finds occurrences of a given string in an HTML file in a couple of ways. The benchmark's only purpose is to see what the additional code branches cost. I also provide a reference C-based case-insensitive search. Since it is not written in assembly, the compiler is not guaranteed to turn it into code that favors registers over memory accesses. But this is where a CPU with a large L2 cache comes into play.
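To give an idea of what such an occurrence-counting loop might look like, here is a minimal sketch in plain C. It is not the benchmark code from the zip package: `find_nocase` and `count_nocase` are hypothetical names, the search relies on the standard `tolower` rather than on the assembly routine, and overlapping matches are counted, which is an assumption made for this sketch.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical reference search: plain C, case folding via tolower(). */
static const char *find_nocase(const char *hay, const char *needle)
{
    for (; *hay != '\0'; ++hay) {
        size_t i = 0;
        /* Advance while the pattern has characters left and they match. */
        while (needle[i] != '\0' &&
               tolower((unsigned char)hay[i]) == tolower((unsigned char)needle[i]))
            ++i;
        if (needle[i] == '\0')
            return hay;                 /* reached the end of the pattern */
    }
    return NULL;
}

/* Count case-insensitive occurrences of needle in hay (overlaps included). */
size_t count_nocase(const char *hay, const char *needle)
{
    size_t count = 0;
    const char *hit;

    if (*needle == '\0')
        return 0;                       /* an empty pattern is not counted */

    while ((hit = find_nocase(hay, needle)) != NULL) {
        ++count;
        hay = hit + 1;                  /* resume one byte past the match start */
    }
    return count;
}
```

Calling `count_nocase` on the contents of an HTML file with a tag name as the pattern counts every spelling of that tag regardless of case. Timing the call, for instance with `clock()` from `<time.h>`, would then allow a rough comparison between search routines.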
Enjoy!
Stephane Rodriguez - February 6, 2004.