[C-prog-lang-l] fgrep redux and error detection

Vladimír Kotal vlada at kotalovi.cz
Thu Mar 13 10:20:52 CET 2025


Speaking of grep implementations, there is a new project called krep (https://davidesantangelo.github.io/krep/), a fixed-string grep implementation in C that uses various techniques to speed up the search (multithreading, CPU extensions, dynamic algorithm switching, memory-mapped files, etc.). While these topics are outside the scope of our lecture, you might find it interesting to look at the code, which arguably takes a simple task to the extreme. After all, reading code written by others is another way to learn a language.

Cheers,


V. Kotal

On Thu, Mar 13, 2025, at 09:42, Vladimír Kotal wrote:
> Hi all,
> 
> just a follow-up to the fgrep exercise from the last lecture: an alternative to Valgrind for detecting stack and heap corruption is ASan (https://github.com/google/sanitizers/wiki/AddressSanitizer), which detects the problem and reports it in a high level of detail. ASan is built into newer GCC versions, for example, and can be activated with a command-line switch. This obviously slows the program down at run time (just like Valgrind, albeit maybe not as much), so it may not be suitable for production workloads.
> 
> Here again is the fgrep solution, with the char array reduced to 16 bytes and the maximum line length check ifdef'd out:
> 
> $ gcc -fsanitize=address -fno-omit-frame-pointer -g -O0 src/fgrep.c 
> $ cat /etc/passwd | ./a.out 
> =================================================================
> ==536258==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffe63cf7bb0 at pc 0x564ed305541e bp 0x7ffe63cf7b30 sp 0x7ffe63cf7b20
> WRITE of size 1 at 0x7ffe63cf7bb0 thread T0
>     #0 0x564ed305541d in main src/fgrep.c:20
>     #1 0x7f4b42a3cd8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>     #2 0x7f4b42a3ce3f in __libc_start_main_impl ../csu/libc-start.c:392
>     #3 0x564ed30551a4 in _start (/home/vkotal/MFF/C/c-prog-lang-teacher-notes/a.out+0x11a4)
> 
> Address 0x7ffe63cf7bb0 is located in stack of thread T0 at offset 80 in frame
>     #0 0x564ed3055278 in main src/fgrep.c:5
> 
>   This frame has 2 object(s):
>     [32, 37) 'needle' (line 7)
>     [64, 80) 'line' (line 12) <== Memory access at offset 80 overflows this variable
> HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
>       (longjmp and C++ exceptions *are* supported)
> SUMMARY: AddressSanitizer: stack-buffer-overflow src/fgrep.c:20 in main
> Shadow bytes around the buggy address:
>   0x10004c796f20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x10004c796f30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x10004c796f40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x10004c796f50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x10004c796f60: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
> =>0x10004c796f70: 05 f2 f2 f2 00 00[f3]f3 00 00 00 00 00 00 00 00
>   0x10004c796f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x10004c796f90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x10004c796fa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x10004c796fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x10004c796fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> Shadow byte legend (one shadow byte represents 8 application bytes):
>   Addressable:           00
>   Partially addressable: 01 02 03 04 05 06 07 
>   Heap left redzone:       fa
>   Freed heap region:       fd
>   Stack left redzone:      f1
>   Stack mid redzone:       f2
>   Stack right redzone:     f3
>   Stack after return:      f5
>   Stack use after scope:   f8
>   Global redzone:          f9
>   Global init order:       f6
>   Poisoned by user:        f7
>   Container overflow:      fc
>   Array cookie:            ac
>   Intra object redzone:    bb
>   ASan internal:           fe
>   Left alloca redzone:     ca
>   Right alloca redzone:    cb
>   Shadow gap:              cc
> ==536258==ABORTING
> 
> 
> As you can see, it reports not only the line numbers (thanks to the debug information added via -g) but also the names of the variables involved and the offset at which the problem happened.
> 
> Best regards,
> 
> 
> V. Kotal
> _______________________________________________
> c-prog-lang-l mailing list
> c-prog-lang-l at mff.cuni.cz
> http://mbox.ms.mff.cuni.cz/listserv/listinfo/c-prog-lang-l
> 