Original Paper | A Hands-On Record of AI-Assisted Fuzzing
December 31, 2025, 11:26, South Korea
Author: Knownsec 404 Team (知道创宇404实验室)
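After getting the previous fuzzing approach to work end to end, we started fuzzing the goahead service of another IoT device. The steps and overall approach are the same as before. The only difference is that this time we tried to let AI do the main work throughout, while we acted as the AI's assistants and operators.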
Preparation
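The AI tooling I used is Gemini CLI + VSCode. Because Gemini needs access to IDA, I also had to write a GEMINI.md file in the current directory, with the following content:
You can use the IDA Pro program on the user's computer to analyze binaries. This version of IDA can run without a GUI: load the binary from Python, then use IDAPython to perform the analysis. Usage is as follows:
## idapro.open_database(filename: str, run_auto_analysis: bool)
This function opens a binary file or an idb file. The first argument is the file name, and the second indicates whether auto-analysis should be run. The function has no return value.
## ida_hexrays.decompile(func_ea: int)
Returns the decompilation result of the specified function.
## idapro.close_database(save: bool = True)
Closes and saves the database. The function has no return value.
Notes:
- Import the idapro module first, and only then import other IDAPython modules.
- Only scripts run with uv run python can import the idapro module.
To let Gemini access headless IDA, the following command also has to be run: uv pip install /Applications/IDA\ Professional\ 9.2.app/Contents/MacOS/idalib/python/.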
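As a rough illustration of how the three calls listed in GEMINI.md fit together, the sketch below opens a binary headlessly, decompiles one function, and closes the database without saving. The helper names `parse_address` and `decompile_at` are my own, and the path and address in the usage comment are only placeholders. It assumes the script is run with `uv run python` in an environment where the idalib package from the command above is installed.

```python
def parse_address(text: str) -> int:
    """Parse an address written as hex ('0x4298b4') or as decimal."""
    return int(text, 0)


def decompile_at(binary_path: str, func_ea: int) -> str:
    """Return the pseudocode of the function at func_ea as text."""
    # idapro has to be imported before any other IDAPython module,
    # so both imports are kept together inside the function.
    import idapro
    import ida_hexrays

    idapro.open_database(binary_path, True)  # True: run auto-analysis
    try:
        return str(ida_hexrays.decompile(func_ea))
    finally:
        idapro.close_database(False)  # False: do not save the database


# Hypothetical usage (needs a working idalib installation):
#   print(decompile_at("./sbin/goahead", parse_address("0x4298b4")))
```

Keeping the IDA imports inside the function lets the rest of a script be loaded and checked on machines where idalib is not available.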
- AI-assisted development of nvram_hook
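In the previous article, I developed a matching nvram kernel driver. Later work showed that this approach takes quite a lot of effort, so for this case I decided to try hooking the functions of the nvram library that the program depends on.
The prompt I used was:
I need to emulate a program that depends on @libnvram-0.9.28.so . However, this libnvram requires a corresponding kernel driver, which my emulation environment does not have. I need you to reverse engineer this .so library and write hook code for me. I need to hook all nvram-related functions so that nvram operations interact with a specified file (for example /tmp/nvram.bin) instead of the driver.
The code Gemini CLI generated for me: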
// nvram_hook.c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <stdint.h>

#define NVRAM_FILE "/tmp/nvram.bin"

typedef struct kv {
    char *key;
    char *value;
    struct kv *next;
} kv_t;

static kv_t *head = NULL;
static int initialized = 0;

static void trim(char *s) {
    char *p = s;
    int l = strlen(p);
    while (l > 0 && (p[l-1] == '\r' || p[l-1] == '\n')) p[--l] = 0;
}

static void load_nvram() {
    FILE *fp = fopen(NVRAM_FILE, "r");
    if (!fp) return;
    char line[1024];
    while (fgets(line, sizeof(line), fp)) {
        trim(line);
        char *eq = strchr(line, '=');
        if (eq) {
            *eq = 0;
            char *key = line;
            char *val = eq + 1;
            // Add to list
            kv_t *node = malloc(sizeof(kv_t));
            node->key = strdup(key);
            node->value = strdup(val);
            node->next = head;
            head = node;
        }
    }
    fclose(fp);
}

static void save_nvram() {
    FILE *fp = fopen(NVRAM_FILE, "w");
    if (!fp) {
        perror("Failed to open nvram file for writing");
        return;
    }
    kv_t *curr = head;
    while (curr) {
        fprintf(fp, "%s=%s\n", curr->key, curr->value);
        curr = curr->next;
    }
    fclose(fp);
}

static kv_t *find_kv(const char *key) {
    kv_t *curr = head;
    while (curr) {
        // Check for alignment to avoid SIGBUS
        if (((uintptr_t)curr & 3) != 0) {
            fprintf(stderr, "nvram_hook: Corrupted linked list detected! curr=%p\n", curr);
            return NULL;
        }
        if (curr->key && strcmp(curr->key, key) == 0) return curr;
        curr = curr->next;
    }
    return NULL;
}

static void set_kv(const char *key, const char *value) {
    kv_t *node = find_kv(key);
    if (node) {
        free(node->value);
        node->value = strdup(value);
    } else {
        node = malloc(sizeof(kv_t));
        node->key = strdup(key);
        node->value = strdup(value);
        node->next = head;
        head = node;
    }
}

// Helper to format key based on index
static void format_key(char *buf, int size, int index, const char *key) {
    if (index == 1) {
        snprintf(buf, size, "RTDEV_%s", key);
    } else {
        snprintf(buf, size, "%s", key);
    }
}

// --- Exported Functions ---
int nvram_init(int index) {
    if (!initialized) {
        load_nvram();
        initialized = 1;
    }
    return 0;
}

char *nvram_bufget(int index, const char *key) {
    if (!initialized) nvram_init(0);
    char real_key[256];
    format_key(real_key, sizeof(real_key), index, key);
    kv_t *node = find_kv(real_key);
    if (node) return node->value;
    return ""; // Return empty string if not found, consistent with original behavior
}

int nvram_bufset(int index, const char *key, const char *value) {
    if (!initialized) nvram_init(0);
    char real_key[256];
    format_key(real_key, sizeof(real_key), index, key);
    set_kv(real_key, value);
    return 0;
}

int nvram_commit(int index) {
    save_nvram();
    return 0;
}

// Wrappers
char *nvram_get(int index, const char *key) {
    return nvram_bufget(index, key);
}

int nvram_set(int index, const char *key, const char *value) {
    nvram_bufset(index, key, value);
    return nvram_commit(index);
}

int nvram_getall(char *buf, int len) {
    if (!initialized) nvram_init(0);
    if (!buf || len <= 0) return -1;
    char *p = buf;
    int rem = len;
    kv_t *curr = head;
    while (curr) {
        int l = snprintf(p, rem, "%s=%s", curr->key, curr->value);
        if (l >= rem) break; // Buffer full
        p += l + 1; // Skip string and null terminator
        rem -= l + 1;
        curr = curr->next;
    }
    if (rem > 0) *p = 0; // Double null terminator
    return 0;
}

// Also hook standard close/clear if needed, but these are main ones.
int nvram_close(int index) {
    return 0;
}

int nvram_clear(int index) {
    // Optional: clear memory and file
    return 0;
}
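The hook above reads /tmp/nvram.bin as plain `key=value` lines and stores keys for index 1 with an `RTDEV_` prefix, so the emulated program only sees values that are present in that file. Below is a minimal sketch of how such a file could be prepared and checked from the host side. The function names and the sample keys (`lan_ipaddr`, `SSID1`) are illustrative assumptions, not values taken from the device.

```python
import os
import tempfile


def format_key(index: int, key: str) -> str:
    """Mirror format_key() in nvram_hook.c: index 1 gets an RTDEV_ prefix."""
    return f"RTDEV_{key}" if index == 1 else key


def write_nvram_file(path: str, entries: dict) -> None:
    """Write {(index, key): value} pairs as key=value lines."""
    with open(path, "w") as fp:
        for (index, key), value in entries.items():
            if "=" in key or "\n" in key or "\n" in value:
                raise ValueError(f"cannot store entry for key {key!r}")
            fp.write(f"{format_key(index, key)}={value}\n")


def read_nvram_file(path: str) -> dict:
    """Parse the file the way load_nvram() does: split at the first '='."""
    result = {}
    with open(path) as fp:
        for line in fp:
            key, sep, value = line.rstrip("\r\n").partition("=")
            if sep:  # lines without '=' are skipped, as in the C code
                result[key] = value
    return result


# Hypothetical sample values, written to a temporary file for the demo.
demo_path = os.path.join(tempfile.mkdtemp(), "nvram.bin")
write_nvram_file(demo_path, {(0, "lan_ipaddr"): "192.168.1.1", (1, "SSID1"): "test"})
print(read_nvram_file(demo_path))
```

For real use the output path would be /tmp/nvram.bin, and the keys would come from the device's default configuration.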
- AI-assisted development of the hook toolchain
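2.1 AI-assisted harness development
The next step is to have the AI write the harness code. The earlier harness can be given to Gemini as a reference, and it can then generate the new code based on its reverse engineering of goahead.
This step is very complex, though, and the AI can rarely get it right in one or two attempts. Several rounds of debugging and adjustment are needed. We can still let the AI produce a first draft, using this prompt:
I need to fuzz goahead and need to write a harness. Please refer to @example_harness.c and write a goahead_harness for me. You can use IDA to reverse engineer goahead.
2.2 AI-assisted Makefile development
Once the first draft of the harness is done, the AI is asked to write a Makefile that builds nvram_hook and the harness together. The final result is: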
# Toolchain setup
TOOLCHAIN_DIR = /x/toolchain-mipsel/toolchain-3.4.x/bin
BUILDROOT_LIB = /x/buildroot-2016.02/output/host/usr/lib

# Compiler
CC = $(TOOLCHAIN_DIR)/mipsel-linux-uclibc-gcc

# Flags
CFLAGS = -g -fPIC -shared -Wall -I.

# Source files
SRCS = goahead_harness.c nvram_hook.c

# Targets
TARGET_HOOK = harness_hook.so
TARGET_NOHOOK = harness_nohook.so
TARGET_DEBUG = harness_debug.so

# Export LD_LIBRARY_PATH so the compiler finds its dependencies
export LD_LIBRARY_PATH := $(BUILDROOT_LIB):$(LD_LIBRARY_PATH)

all: $(TARGET_HOOK) $(TARGET_NOHOOK) $(TARGET_DEBUG)

$(TARGET_HOOK): $(SRCS)
	$(CC) $(CFLAGS) -DENABLE_HOOK $^ -o $@ -ldl

$(TARGET_DEBUG): $(SRCS)
	$(CC) $(CFLAGS) -DENABLE_HOOK -DDEBUG_ADDR $^ -o $@ -ldl

$(TARGET_NOHOOK): $(SRCS)
	$(CC) $(CFLAGS) $^ -o $@ -ldl

clean:
	rm -f $(TARGET_HOOK) $(TARGET_NOHOOK) $(TARGET_DEBUG)
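With the Makefile in place, building the harness .so files is easy, and real test runs can show whether the harness has problems. First, harness_nohook can be used to check whether the nvram hook code works. Because this build keeps the original main, goahead starts normally and only the nvram functions are replaced. The command is: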
$ export AFL_PRELOAD=./hook/harness_nohook.so
$ /x/fuzz/AFLplusplus/afl-qemu-trace ./sbin/goahead
......
Next, use harness_debug.so to check whether the harness itself has bugs. The command is:
$ export AFL_PRELOAD=./hook/harness_debug.so
$ DEBUG=1 /x/fuzz/AFLplusplus/afl-qemu-trace ./sbin/goahead in/http.txt
......
Once the command above returns a normal HTTP response, the harness is done and fuzzing can begin.
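The article does not show the content of in/http.txt, so the following is only a sketch under my own assumptions: a minimal GET request as the seed, and a small check that the captured output of the debug run contains an HTTP status line. Both function names are hypothetical.

```python
def build_seed_request(path: str = "/", host: str = "localhost") -> bytes:
    """Build a minimal HTTP/1.1 GET request to use as a seed input."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "", ""]
    return "\r\n".join(lines).encode("ascii")


def looks_like_http_response(output: bytes) -> bool:
    """Return True if any line of the captured output is an HTTP status line."""
    return any(line.startswith(b"HTTP/1.") for line in output.splitlines())


# Example: the seed ends with the blank line that terminates the headers.
print(build_seed_request())
```

The seed bytes would be written to in/http.txt, and the second helper could be applied to the captured output of the debug command to automate the manual check described above.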
My final harness is shown below:
#define _GNU_SOURCE
#include <sys/mman.h>
#include <dlfcn.h>
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/syscall.h>
#include <poll.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <signal.h>
#include <ucontext.h>
#define MAX_PATH 0x100
static int debug_enabled = 0;

// Logging macro
#define LOG(...) do { \
    if (debug_enabled) { \
        fprintf(stderr, "harness: " __VA_ARGS__); \
        fflush(stderr); \
    } \
} while (0)

#ifdef DEBUG_ADDR
// Crash handler to print more context on SIGSEGV
void segv_handler(int sig, siginfo_t *info, void *ucontext) {
    ucontext_t *ctx = (ucontext_t *)ucontext;
    // MIPS registers from ucontext
    // Adjusted for uClibc MIPS which typically uses 'gregs' array
    unsigned long pc = ctx->uc_mcontext.pc;
    unsigned long ra = ctx->uc_mcontext.gregs[31];
    unsigned long sp = ctx->uc_mcontext.gregs[29];
    fprintf(stderr, "\n[HARNESS] CRASH_INFO: PC=0x%lx, RA=0x%lx, SP=0x%lx\n", pc, ra, sp);
    // Safe stack dump to help deduplication when PC is corrupted
    if (sp > 0x1000) {
        unsigned long *stack = (unsigned long *)sp;
        fprintf(stderr, "[HARNESS] STACK_DUMP: %08lx %08lx %08lx %08lx\n",
                stack[0], stack[1], stack[2], stack[3]);
    }
    fflush(stderr);
    // Restore default handler and re-raise to crash properly
    signal(sig, SIG_DFL);
    raise(sig);
}
#endif

// Address definitions
#define ADDR_SOCKET_OPEN 0x414300
#define ADDR_WEBS_READ_EVENT 0x4298b4
#define ADDR_WEBS_ALLOC 0x427e64
#define ADDR_SOCKET_ALLOC 0x4131b0
#define ADDR_SOCKET_FREE 0x4133b8
#define ADDR_WEBS 0x538db0
#define ADDR_WEBS_FREE 0x4281b4
#define ADDR_SOCKET_LIST 0x538da0
#define ADDR_UM_OPEN 0x41f860
#define ADDR_CMD_PORT 0x53ad84
// Addresses from nm
#define ADDR_BOPEN 0x00409a20
#define ADDR_ASCTOUNI 0x00412458
#define ADDR_WEBSSETDEFAULTDIR 0x0040bc2c
#define ADDR_WEBSSETIPADDR 0x0042958c
#define ADDR_WEBSSETHOST 0x00429500
#define ADDR_WEBSSETDEFAULTPAGE 0x0040bb8c
#define ADDR_WEBSSETPASSWORD 0x0041cf10
#define ADDR_WEBSOPENSERVER 0x00426920
#define ADDR_WEBSURLHANDLERDEFINE 0x00410b18
#define ADDR_WEBSAUTHHANDLER 0x0041877c
#define ADDR_WEBSFORMHANDLER 0x00410220
#define ADDR_WEBSCGIHANDLER 0x0040a580
#define ADDR_WEBSREBOOTHANDLER 0x00489924
#define ADDR_WEBSRESYNCFGHANDLER 0x004890a8
#define ADDR_WEBSEXSYSLOGHANDLER 0x00489d68
#define ADDR_WEBSREOPENPAGEHANDLER 0x00489e58
#define ADDR_WEBSDEFAULTHANDLER 0x0040b598
#define ADDR_FORMDEFINEWIZARD 0x00445668
#define ADDR_FORMDEFINEUTILITIES 0x004456fc
#define ADDR_FORMDEFINEINTERNET 0x0042dbf4
#define ADDR_FORMDEFINEUSB 0x004aba20
#define ADDR_FORMDEFINEWIRELESS 0x0048cc10
#define ADDR_FORMDEFINEFIREWALL 0x004a3608
#define ADDR_FORMDEFINEACLQOS 0x004ab900
#define ADDR_FORMDEFINEACLFIREWALL 0x004a8830
#define ADDR_FORMDEFINELOGIN 0x0041c844
#define ADDR_FORMDEFINEVOIP 0x004871a0
#define ADDR_FORMDEFINEMANAGEMENT 0x0046d450
#define ADDR_WEBSSSLOPEN 0x004c4708
#define ADDR_INITVOIPDEVICE 0x004740fc
#define ADDR_VOIP_PARAM_SYN_CHECK 0x004409c8
#define ADDR_INITMENU 0x0044dbc0
#define ADDR_INITTIMERTASK 0x00488a24
#define ADDR_WEB_ACL_ACCESS_SET 0x00416018
// Hardcoded static function from analysis
#define ADDR_ROOT_HANDLER 0x48B6BC
// Struct offsets
#define SOCKET_FD_OFFSET 176
// Function typedefs
typedef int (*BOPEN_T)(void *buf, int bufsize, int flags);
typedef void (*ASCTOUNI_T)(uint16_t *dst, char *src, int len);
typedef void (*WEBS_READ_EVENT_T)(void *wp);
typedef int (*WEBS_ALLOC_T)(int sid);
typedef void (*WEBS_FREE_T)(void *wp);
typedef int (*SOCKET_ALLOC_T)(char *host, int port, void *accept, int flags);
typedef void (*SOCKET_FREE_T)(int sid);
typedef int (*SOCKET_OPEN_T)(void);
typedef void (*UM_OPEN_T)(void);
typedef void (*WEBS_SET_DEFAULT_DIR_T)(char *dir);
typedef void (*WEBS_SET_IPADDR_T)(char *ip);
typedef void (*WEBS_SET_HOST_T)(char *host);
typedef void (*WEBS_SET_DEFAULT_PAGE_T)(char *page);
typedef void (*WEBS_SET_PASSWORD_T)(char *password);
typedef int (*WEBS_URL_HANDLER_DEFINE_T)(char *url, void *handler, int arg, void *function, int flags);
typedef int (*WEBS_OPEN_SERVER_T)(int port, int retries);
typedef void (*WEB_ACL_ACCESS_SET_T)(int enabled);
typedef void (*FORM_DEFINE_T)(void);
// Function macros
#define bopen ((BOPEN_T)ADDR_BOPEN)
#define ascToUni ((ASCTOUNI_T)ADDR_ASCTOUNI)
#define socketOpen ((SOCKET_OPEN_T)ADDR_SOCKET_OPEN)
#define umOpen ((UM_OPEN_T)ADDR_UM_OPEN)
#define websAlloc ((WEBS_ALLOC_T)ADDR_WEBS_ALLOC)
#define websFree ((WEBS_FREE_T)ADDR_WEBS_FREE)
#define websReadEvent ((WEBS_READ_EVENT_T)ADDR_WEBS_READ_EVENT)
#define socketAlloc ((SOCKET_ALLOC_T)ADDR_SOCKET_ALLOC)
#define socketFree ((SOCKET_FREE_T)ADDR_SOCKET_FREE)
#define websSetDefaultDir ((WEBS_SET_DEFAULT_DIR_T)ADDR_WEBSSETDEFAULTDIR)
#define websSetIpaddr ((WEBS_SET_IPADDR_T)ADDR_WEBSSETIPADDR)
#define websSetHost ((WEBS_SET_HOST_T)ADDR_WEBSSETHOST)
#define websSetDefaultPage ((WEBS_SET_DEFAULT_PAGE_T)ADDR_WEBSSETDEFAULTPAGE)
#define websSetPassword ((WEBS_SET_PASSWORD_T)ADDR_WEBSSETPASSWORD)
#define websUrlHandlerDefine ((WEBS_URL_HANDLER_DEFINE_T)ADDR_WEBSURLHANDLERDEFINE)
#define websOpenServer ((WEBS_OPEN_SERVER_T)ADDR_WEBSOPENSERVER)
#define WebAclAccessSet ((WEB_ACL_ACCESS_SET_T)ADDR_WEB_ACL_ACCESS_SET)
#define formDefineWizard ((FORM_DEFINE_T)ADDR_FORMDEFINEWIZARD)
#define formDefineUtilities ((FORM_DEFINE_T)ADDR_FORMDEFINEUTILITIES)
#define formDefineInternet ((FORM_DEFINE_T)ADDR_FORMDEFINEINTERNET)
#define formDefineUSB ((FORM_DEFINE_T)ADDR_FORMDEFINEUSB)
#define formDefineWireless ((FORM_DEFINE_T)ADDR_FORMDEFINEWIRELESS)
#define formDefineFirewall ((FORM_DEFINE_T)ADDR_FORMDEFINEFIREWALL)
#define formDefineACLQoS ((FORM_DEFINE_T)ADDR_FORMDEFINEACLQOS)
#define formDefineACLFirewall ((FORM_DEFINE_T)ADDR_FORMDEFINEACLFIREWALL)
#define formDefineLogin ((FORM_DEFINE_T)ADDR_FORMDEFINELOGIN)
#define formDefineVoip ((FORM_DEFINE_T)ADDR_FORMDEFINEVOIP)
#define formDefineManagement ((FORM_DEFINE_T)ADDR_FORMDEFINEMANAGEMENT)
#define initMenu ((FORM_DEFINE_T)ADDR_INITMENU)
#define initTimerTask ((FORM_DEFINE_T)ADDR_INITTIMERTASK)
// uClibc hook related
typedef int (*uClibc_main_t)(int (*main)(int, char **, char **),
                             int argc, char **argv,
                             void (*app_init)(void),
                             void (*app_fini)(void),
                             void (*rtld_fini)(void),
                             void *stack_end);
static uClibc_main_t uClibc_main_orig = NULL;
static int client_fd = -1; // Global to store the client side of the socket pair
static int global_sid = -1;
static char *global_sock = NULL;

// Helper to create a socket pair populated with file content
static int create_socket_from_file(const char *filename)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        perror("socketpair");
        return -1;
    }
    client_fd = sv[0]; // sv[0] is for writing data (simulating client)
    // sv[1] is for reading data (simulating server/GoAhead)
    int file_fd = open(filename, O_RDONLY);
    if (file_fd < 0) {
        perror("open file");
        close(sv[0]);
        close(sv[1]);
        client_fd = -1;
        return -1;
    }
    // Use lseek instead of fstat to avoid potential struct stat ABI mismatches
    off_t file_size = lseek(file_fd, 0, SEEK_END);
    if (file_size < 0) {
        perror("lseek size");
        close(file_fd);
        close(sv[0]);
        close(sv[1]);
        client_fd = -1;
        return -1;
    }
    if (lseek(file_fd, 0, SEEK_SET) < 0) {
        perror("lseek rewind");
        close(file_fd);
        close(sv[0]);
        close(sv[1]);
        client_fd = -1;
        return -1;
    }
    LOG("File %s size detected as: %ld bytes\n", filename, (long)file_size);
    off_t offset = 0;
    size_t remaining = file_size;
    while (remaining > 0) {
        ssize_t sent = sendfile(sv[0], file_fd, &offset, remaining);
        if (sent < 0) {
            if (errno == EINTR || errno == EAGAIN) continue;
            perror("sendfile");
            close(file_fd);
            close(sv[0]);
            close(sv[1]);
            client_fd = -1;
            return -1;
        }
        remaining -= sent;
    }
    close(file_fd);
    // Shutdown write on client side to indicate EOF
    shutdown(sv[0], SHUT_WR);
    return sv[1];
}

void init() {
    // Set global configuration
    *(int *)ADDR_CMD_PORT = 8080;
    // Initialize Memory Allocator (Must be first)
    bopen(NULL, 61440, 1);
    // Initialize GoAhead structures
    socketOpen();
    umOpen();
    // Convert strings to Unicode as seen in main()
    uint16_t uniIp[128];
    uint16_t uniHost[128];
    ascToUni(uniIp, "0.0.0.0", 8);
    ascToUni(uniHost, "localhost", 10);
    // Web Server Setup matching main()
    websSetDefaultDir("/etc_ro/web");
    websSetIpaddr((char *)uniIp);
    websSetHost((char *)uniHost);
    websSetDefaultPage("default.asp");
    websSetPassword("");
    // Initialize the server structures (crucial)
    websOpenServer(8080, 0);
    // Enable Access Control
    WebAclAccessSet(1);
    // Handlers
    websUrlHandlerDefine("", 0, 0, (void *)ADDR_WEBSAUTHHANDLER, 1);
    websUrlHandlerDefine("/goform", 0, 0, (void *)ADDR_WEBSFORMHANDLER, 0);
    websUrlHandlerDefine("/cgi-bin", 0, 0, (void *)ADDR_WEBSCGIHANDLER, 0);
    websUrlHandlerDefine("/reboot.asp", 0, 0, (void *)ADDR_WEBSREBOOTHANDLER, 0);
    websUrlHandlerDefine("/reboot", 0, 0, (void *)ADDR_WEBSREBOOTHANDLER, 0);
    websUrlHandlerDefine("/resyncfg", 0, 0, (void *)ADDR_WEBSRESYNCFGHANDLER, 0);
    websUrlHandlerDefine("/exception_log", 0, 0, (void *)ADDR_WEBSEXSYSLOGHANDLER, 0);
    websUrlHandlerDefine("/dhcpcliinfo", 0, 0, (void *)ADDR_WEBSREOPENPAGEHANDLER, 0);
    websUrlHandlerDefine("/auto_ch", 0, 0, (void *)ADDR_WEBSREOPENPAGEHANDLER, 0);
    websUrlHandlerDefine("/auto_5G_ch", 0, 0, (void *)ADDR_WEBSREOPENPAGEHANDLER, 0);
    websUrlHandlerDefine("/registration", 0, 0, (void *)ADDR_WEBSREOPENPAGEHANDLER, 0);
    websUrlHandlerDefine("", 0, 0, (void *)ADDR_WEBSDEFAULTHANDLER, 2);
    // Forms
    formDefineWizard();
    formDefineUtilities();
    formDefineInternet();
    formDefineUSB();
    formDefineWireless();
    formDefineFirewall();
    formDefineACLQoS();
    formDefineACLFirewall();
    formDefineLogin();
    formDefineVoip();
    formDefineManagement();
    // Root handler
    websUrlHandlerDefine("/", 0, 0, (void *)ADDR_ROOT_HANDLER, 0);
    // Misc
    initMenu();
    initTimerTask();
    // Pre-allocate a socket for fuzzing to save cycles
    global_sid = socketAlloc("127.0.0.1", 80, NULL, 0);
    if (global_sid < 0) {
        LOG("Failed to allocate global socket in init\n");
        exit(1);
    }
    // Cache the socket structure pointer
    void **socket_list = *(void ***)ADDR_SOCKET_LIST;
    if (socket_list) {
        global_sock = (char *)socket_list[global_sid];
    }
    if (!global_sock) {
        LOG("Failed to get global socket structure in init\n");
        exit(1);
    }
    LOG("GoAhead initialized with handlers.\n");
}

void fuzz(const char *filename)
{
    int memfd; // Actually it's sv[1] (server socket)
    LOG("do fuzz(%s)\n", filename);
    memfd = create_socket_from_file(filename);
    if (memfd == -1) {
        perror("create_socket_from_file error.");
        return;
    }
    // Reuse pre-allocated socket structure
    if (!global_sock) {
        LOG("Global socket not initialized\n");
        close(memfd);
        if (client_fd != -1) close(client_fd);
        client_fd = -1;
        return;
    }
    // Set the FD in the socket structure
    *(int *)(global_sock + SOCKET_FD_OFFSET) = memfd;
    LOG("Hijacked socket %d with server-socket %d (client-socket %d)\n", global_sid, memfd, client_fd);
    // Allocate webs structure
    int wid = websAlloc(global_sid);
    if (wid < 0) {
        LOG("websAlloc failed\n");
        return;
    }
    // Get webs structure
    void **webs_arr = *(void ***)ADDR_WEBS;
    if (!webs_arr) {
        LOG("webs array is NULL\n");
        return;
    }
    void *wp = webs_arr[wid];
    if (!wp) {
        LOG("webs structure is NULL for wid %d\n", wid);
        return;
    }
    LOG("Starting websReadEvent processing...\n");
    websReadEvent(wp);
    LOG("websReadEvent finished.\n");
    // Dump response if DEBUG is set
#ifdef DEBUG_ADDR
    if (client_fd != -1) {
        char dump_buf[4096];
        LOG("\n--- HTTP RESPONSE START ---\n");
        // Setting O_NONBLOCK on client_fd
        int flags = fcntl(client_fd, F_GETFL, 0);
        fcntl(client_fd, F_SETFL, flags | O_NONBLOCK);
        struct pollfd pfd;
        pfd.fd = client_fd;
        pfd.events = POLLIN;
        // Try to read for up to 500ms
        int total_timeout = 500;
        int elapsed = 0;
        while (elapsed < total_timeout) {
            int ret = poll(&pfd, 1, 100); // Wait 100ms
            if (ret < 0) {
                perror("poll");
                break;
            }
            if (ret == 0) {
                elapsed += 100;
                continue; // Timeout, try again until total_timeout
            }
            if (pfd.revents & POLLIN) {
                // Data available
                ssize_t n = read(client_fd, dump_buf, sizeof(dump_buf));
                if (n > 0) {
                    fwrite(dump_buf, 1, n, stdout);
                    // Reset timeout if we got data, expecting more
                    elapsed = 0;
                } else if (n == 0) {
                    break; // EOF
                } else {
                    if (errno != EAGAIN && errno != EWOULDBLOCK) {
                        perror("read");
                        break;
                    }
                }
            } else if (pfd.revents & (POLLHUP | POLLERR | POLLNVAL)) {
                break; // Disconnected
            }
        }
        LOG("\n--- HTTP RESPONSE END ---\n");
        fflush(stdout);
    }
#endif
    // Cleanup
    if (client_fd != -1) close(client_fd);
    client_fd = -1;
    close(memfd);
    // Free GoAhead structures to prevent resource exhaustion in persistent mode
    if (wid >= 0 && webs_arr) {
        void *wp_to_free = webs_arr[wid];
        if (wp_to_free) {
            websFree(wp_to_free);
        }
    }
    // Socket is reused, do NOT free it
    // if (sid >= 0) socketFree(sid);
}

#ifdef ENABLE_HOOK
// Fake bind/listen to speed up initialization
int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen) {
    return 0;
}

int listen(int sockfd, int backlog) {
    return 0;
}

// Hook trace functions to suppress GoAhead output
void trace(int level, char *fmt, ...) {
    return;
}

void traceRaw(char *fmt, ...) {
    return;
}
#endif

int main_hook(int argc, char **argv, char **envp) {
    (void)envp;
#ifdef DEBUG_ADDR
    printf("fuzz function address: %p\n", fuzz);
#endif
    if (argc < 2) {
        printf("./goahead [filename]\n");
        return 0;
    }
    init();
    LOG("harness hook\n");
    fuzz(argv[1]);
    LOG("success exit\n");
    return 0;
}

int __uClibc_main(
    int (*main)(int, char **, char **),
    int argc,
    char **argv,
    void (*app_init)(void),
    void (*app_fini)(void),
    void (*rtld_fini)(void),
    void *stack_end) {
    if (!uClibc_main_orig) {
        uClibc_main_orig = dlsym(RTLD_NEXT, "__uClibc_main");
        if (!uClibc_main_orig) {
            LOG("dlsym(RTLD_NEXT, __uClibc_main_orig) failed: %s\n", dlerror());
            _exit(1);
        }
    }
#ifdef ENABLE_HOOK
    LOG("ENABLE_HOOK defined, using main_hook\n");
    return uClibc_main_orig(main_hook, argc, argv, app_init, app_fini, rtld_fini, stack_end);
#else
    LOG("ENABLE_HOOK not defined, using original main\n");
    return uClibc_main_orig(main, argc, argv, app_init, app_fini, rtld_fini, stack_end);
#endif
}

__attribute__((constructor))
static void harness_init(void)
{
    if (getenv("DEBUG")) {
        debug_enabled = 1;
    }
    LOG("constructor executed: harness.so loaded\n");
#ifdef DEBUG_ADDR
    // Register crash handler
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);
#endif
    uClibc_main_orig = dlsym(RTLD_NEXT, "__uClibc_main");
}
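The central trick in the harness is create_socket_from_file(): the test case is written into one end of a socket pair, the write side is shut down to signal EOF, and the other end is handed to GoAhead as if it were an accepted connection. The same idea can be tried out on the host in a few lines. This is only an illustration of the technique, with a function name of my own, and like the C version it assumes the payload fits into the socket buffer.

```python
import socket


def feed_through_socketpair(payload: bytes) -> bytes:
    """Send payload into one end of a socket pair and read it from the other."""
    client, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.sendall(payload)          # plays the role of sendfile()
        client.shutdown(socket.SHUT_WR)  # the reader sees EOF after the data
        chunks = []
        while True:
            chunk = server.recv(4096)    # this is the side GoAhead would read
            if not chunk:                # empty read: EOF reached
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        client.close()
        server.close()


print(feed_through_socketpair(b"GET / HTTP/1.1\r\n\r\n"))
```

Because the client end stays open for reading, the response written by the server side can still be collected from it afterwards, which is what the DEBUG_ADDR branch of fuzz() does.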
- AI-assisted extension of QEMU
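While debugging the harness, I ran into a problem:
Emulating an IoT web service almost always requires chroot to switch into the device's root filesystem. In the earlier httpd case only the /www directory was needed, so a simple symbolic link was enough. The goahead case, however, depends on a large number of the device's configuration files, and creating a symbolic link for each one is tedious. chroot is the natural fit here, but using it raises two problems:
- It requires root privileges.
- The dependencies of qemu itself have to be resolved inside the new root.
So I wondered: why can't qemu have chroot-like functionality built in? The prompt I used was:
I want to patch qemu to implement functionality similar to fakechroot. My idea is to set an environment variable that specifies the root directory path, then check whether the variable exists. If it does, every call that operates on files gets this path prepended. For example, if I set ROOT_PATH=/home/ubuntu/root and the emulated program needs to read /etc/passwd, it actually reads /home/ubuntu/root/etc/passwd. If it reads ./etc/passwd, it still reads ./etc/passwd. When a path already starts with ROOT_PATH, the prefix directory is not added. In addition, you need to consider not only read operations but also write operations, stat operations, and any other file-related calls I have not thought of.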
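The mapping rule described in the prompt is small enough to write down on its own. The sketch below covers only that rule (absolute paths get the prefix, relative paths and already-prefixed paths are left alone). It is not a rendering of the QEMU code, and the function name `map_guest_path` is hypothetical.

```python
from typing import Optional


def map_guest_path(name: str, root: Optional[str]) -> str:
    """Map a path used by the emulated program to a path on the host."""
    # No ROOT_PATH set, or a relative path: leave the name unchanged.
    if not root or not name.startswith("/"):
        return name
    root = root.rstrip("/")
    # Already inside ROOT_PATH: do not add the prefix a second time.
    if name == root or name.startswith(root + "/"):
        return name
    return root + name


root = "/home/ubuntu/root"
for guest in ("/etc/passwd", "./etc/passwd", "/home/ubuntu/root/etc/passwd"):
    print(guest, "->", map_guest_path(guest, root))
```

Comparing against `root + "/"` rather than the bare prefix keeps a sibling directory such as /home/ubuntu/rootfs from being mistaken for a path that is already inside the root.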
The resulting patch is as follows:
diff --git a/include/qemu/path.h b/include/qemu/path.h
index c6292a9709..9187901cbb 100644
--- a/include/qemu/path.h
+++ b/include/qemu/path.h
@@ -2,6 +2,7 @@
 #define QEMU_PATH_H
 void init_paths(const char *prefix);
+void init_paths_forced(const char *prefix);
 const char *path(const char *pathname);
 #endif
diff --git a/linux-user/main.c b/linux-user/main.c
index b0172d86fb..ef79f98b9d 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -892,7 +892,11 @@ int main(int argc, char **argv, char **envp)
     memset(&bprm, 0, sizeof(bprm));
     /* Scan interp_prefix dir for replacement files. */
-    init_paths(interp_prefix);
+    if (getenv("ROOT_PATH")) {
+        init_paths_forced(getenv("ROOT_PATH"));
+    } else {
+        init_paths(interp_prefix);
+    }
     init_qemu_uname_release();
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 5461f789f7..b300c9d4a7 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -7347,7 +7347,7 @@ static inline abi_long target_truncate64(CPUArchState *cpu_env, const char *arg1
         arg2 = arg3;
         arg3 = arg4;
     }
-    return get_errno(truncate(arg1, target_offset64(arg2, arg3)));
+    return get_errno(truncate(path(arg1), target_offset64(arg2, arg3)));
 }
 #endif
@@ -8713,8 +8713,8 @@ static int do_execv(CPUArchState *cpu_env, int dirfd,
         exe = exec_path;
     }
     ret = is_execveat
-        ? safe_execveat(dirfd, exe, argp, envp, flags)
-        : safe_execve(exe, argp, envp);
+        ? safe_execveat(dirfd, path(exe), argp, envp, flags)
+        : safe_execve(path(exe), argp, envp);
     ret = get_errno(ret);
     unlock_user(p, pathname, 0);
@@ -9491,7 +9491,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
             if (!p || !p2)
                 ret = -TARGET_EFAULT;
             else
-                ret = get_errno(link(p, p2));
+                ret = get_errno(link(path(p), path(p2)));
             unlock_user(p2, arg2, 0);
             unlock_user(p, arg1, 0);
         }
@@ -9518,7 +9518,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
     case TARGET_NR_unlink:
         if (!(p = lock_user_string(arg1)))
             return -TARGET_EFAULT;
-        ret = get_errno(unlink(p));
+        ret = get_errno(unlink(path(p)));
         unlock_user(p, arg1, 0);
         return ret;
 #endif
@@ -9526,7 +9526,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
     case TARGET_NR_unlinkat:
         if (!(p = lock_user_string(arg2)))
             return -TARGET_EFAULT;
-        ret = get_errno(unlinkat(arg1, p, arg3));
+        ret = get_errno(unlinkat(arg1, path(p), arg3));
         unlock_user(p, arg2, 0);
         return ret;
 #endif
@@ -9537,7 +9537,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
     case TARGET_NR_chdir:
         if (!(p = lock_user_string(arg1)))
             return -TARGET_EFAULT;
-        ret = get_errno(chdir(p));
+        ret = get_errno(chdir(path(p)));
         unlock_user(p, arg1, 0);
         return ret;
 #ifdef TARGET_NR_time
@@ -9556,7 +9556,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
     case TARGET_NR_mknod:
         if (!(p = lock_user_string(arg1)))
             return -TARGET_EFAULT;
-        ret = get_errno(mknod(p, arg2, arg3));
+        ret = get_errno(mknod(path(p), arg2, arg3));
         unlock_user(p, arg1, 0);
         return ret;
 #endif
@@ -9572,7 +9572,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
     case TARGET_NR_chmod:
         if (!(p = lock_user_string(arg1)))
             return -TARGET_EFAULT;
-        ret = get_errno(chmod(p, arg2));
+        ret = get_errno(chmod(path(p), arg2));
         unlock_user(p, arg1, 0);
         return ret;
 #endif
@@ -9630,9 +9630,9 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
              * string.
              */
             if (!arg5) {
-                ret = mount(p, p2, p3, (unsigned long)arg4, NULL);
+                ret = mount(path(p), path(p2), p3, (unsigned long)arg4, NULL);
             } else {
-                ret = mount(p, p2, p3, (unsigned long)arg4, g2h(cpu, arg5));
+                ret = mount(path(p), path(p2), p3, (unsigned long)arg4, g2h(cpu, arg5));
             }
             ret = get_errno(ret);
@@ -9654,7 +9654,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
 #endif
         if (!(p = lock_user_string(arg1)))
             return -TARGET_EFAULT;
-        ret = get_errno(umount(p));
+        ret = get_errno(umount(path(p)));
         unlock_user(p, arg1, 0);
         return ret;
 #endif
@@ -9677,7 +9677,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
                 unlock_user(p2, arg2, 0);
                 return -TARGET_EFAULT;
             }
-            ret = get_errno(sys_move_mount(arg1, p2, arg3, p4, arg5));
+            ret = get_errno(sys_move_mount(arg1, path(p2), arg3, path(p4), arg5));
             unlock_user(p2, arg2, 0);
             unlock_user(p4, arg4, 0);
@@ -9705,7 +9705,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
                 host_flags |= O_CLOEXEC;
             }
-            ret = get_errno(sys_open_tree(arg1, p2, host_flags));
+            ret = get_errno(sys_open_tree(arg1, path(p2), host_flags));
             unlock_user(p2, arg2, 0);
@@ -9751,7 +9751,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
             }
             if (!(p = lock_user_string(arg1)))
                 return -TARGET_EFAULT;
-            ret = get_errno(utime(p, host_tbuf));
+            ret = get_errno(utime(path(p), host_tbuf));
             unlock_user(p, arg1, 0);
         }
         return ret;
@@ -9771,7 +9771,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
             }
             if (!(p = lock_user_string(arg1)))
                 return -TARGET_EFAULT;
-            ret = get_errno(utimes(p, tvp));
+            ret = get_errno(utimes(path(p), tvp));
             unlock_user(p, arg1, 0);
         }
         return ret;
@@ -9811,7 +9811,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
         if (!(p = lock_user_string(arg2))) {
             return -TARGET_EFAULT;
         }
-        ret = get_errno(faccessat(arg1, p, arg3, 0));
+        ret = get_errno(faccessat(arg1, path(p), arg3, 0));
         unlock_user(p, arg2, 0);
         return ret;
 #endif
@@ -9820,7 +9820,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
         if (!(p = lock_user_string(arg2))) {
             return -TARGET_EFAULT;
         }
-        ret = get_errno(faccessat(arg1, p, arg3, arg4));
+        ret = get_errno(faccessat(arg1, path(p), arg3, arg4));
         unlock_user(p, arg2, 0);
         return ret;
 #endif
@@ -9846,7 +9846,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
             if (!p || !p2)
                 ret = -TARGET_EFAULT;
             else
-                ret = get_errno(rename(p, p2));
+                ret = get_errno(rename(path(p), path(p2)));
             unlock_user(p2, arg2, 0);
             unlock_user(p, arg1, 0);
         }
@@ -9861,7 +9861,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
             if (!p || !p2)
                 ret = -TARGET_EFAULT;
             else
-                ret = get_errno(renameat(arg1, p, arg3, p2));
+                ret = get_errno(renameat(arg1, path(p), arg3, path(p2)));
             unlock_user(p2, arg4, 0);
             unlock_user(p, arg2, 0);
         }
@@ -9876,7 +9876,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
             if (!p || !p2) {
                 ret = -TARGET_EFAULT;
             } else {
-                ret = get_errno(sys_renameat2(arg1, p, arg3, p2, arg5));
+                ret = get_errno(sys_renameat2(arg1, path(p), arg3, path(p2), arg5));
             }
             unlock_user(p2, arg4, 0);
             unlock_user(p, arg2, 0);
@@ -9887,7 +9887,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
     case TARGET_NR_mkdir:
         if (!(p = lock_user_string(arg1)))
             return -TARGET_EFAULT;
-        ret = get_errno(mkdir(p, arg2));
+        ret = get_errno(mkdir(path(p), arg2));
         unlock_user(p, arg1, 0);
         return ret;
 #endif
@@ -9895,7 +9895,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
     case TARGET_NR_mkdirat:
         if (!(p = lock_user_string(arg2)))
             return -TARGET_EFAULT;
-        ret = get_errno(mkdirat(arg1, p, arg3));
+        ret = get_errno(mkdirat(arg1, path(p), arg3));
         unlock_user(p, arg2, 0);
         return ret;
 #endif
@@ -9903,7 +9903,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
     case TARGET_NR_rmdir:
         if (!(p = lock_user_string(arg1)))
             return -TARGET_EFAULT;
-        ret = get_errno(rmdir(p));
+        ret = get_errno(rmdir(path(p)));
         unlock_user(p, arg1, 0);
         return ret;
 #endif
@@ -9955,7 +9955,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
     case TARGET_NR_umount2:
         if (!(p = lock_user_string(arg1)))
             return -TARGET_EFAULT;
-        ret = get_errno(umount2(p, arg2));
+        ret = get_errno(umount2(path(p), arg2));
         unlock_user(p, arg1, 0);
         return ret;
 #endif
@@ -9972,7 +9972,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
     case TARGET_NR_chroot:
         if (!(p = lock_user_string(arg1)))
             return -TARGET_EFAULT;
-        ret = get_errno(chroot(p));
+        ret = get_errno(chroot(path(p)));
         unlock_user(p, arg1, 0);
         return ret;
 #ifdef TARGET_NR_dup2
@@ -10577,7 +10577,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
             if (!p || !p2)
                 ret = -TARGET_EFAULT;
             else
-                ret = get_errno(symlink(p, p2));
+                ret = get_errno(symlink(path(p), path(p2)));
             unlock_user(p2, arg2, 0);
             unlock_user(p, arg1, 0);
         }
@@ -10592,7 +10592,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
             if (!p || !p2)
                 ret = -TARGET_EFAULT;
             else
-                ret = get_errno(symlinkat(p, arg2, p2));
+                ret = get_errno(symlinkat(path(p), arg2, path(p2)));
             unlock_user(p2, arg3, 0);
             unlock_user(p, arg1, 0);
         }
@@ -10641,7 +10641,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
     case TARGET_NR_swapon:
         if (!(p = lock_user_string(arg1)))
             return -TARGET_EFAULT;
-        ret = get_errno(swapon(p, arg2));
+        ret = get_errno(swapon(path(p), arg2));
         unlock_user(p, arg1, 0);
         return ret;
 #endif
@@ -10737,7 +10737,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
     case TARGET_NR_truncate:
         if (!(p = lock_user_string(arg1)))
             return -TARGET_EFAULT;
-        ret = get_errno(truncate(p, arg2));
+        ret = get_errno(truncate(path(p), arg2));
         unlock_user(p, arg1, 0);
         return ret;
 #endif
@@ -11125,7 +11125,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
     case TARGET_NR_swapoff:
         if (!(p = lock_user_string(arg1)))
             return -TARGET_EFAULT;
-        ret = get_errno(swapoff(p));
+        ret = get_errno(swapoff(path(p)));
         unlock_user(p, arg1, 0);
         return ret;
 #endif
diff --git a/util/path.c b/util/path.c
index 8e174eb436..2bb9fe40ad 100644
--- a/util/path.c
+++ b/util/path.c
@@ -13,6 +13,7 @@
 static const char *base;
 static GHashTable *hash;
 static QemuMutex lock;
+static bool force_prefix;
 void init_paths(const char *prefix)
 {
@@ -32,6 +33,12 @@ void init_paths(const char *prefix)
     qemu_mutex_init(&lock);
 }
+void init_paths_forced(const char *prefix)
+{
+    init_paths(prefix);
+    force_prefix = true;
+}
+
 /* Look for path in emulation dir, otherwise return name. */
 const char *path(const char *name)
 {
@@ -43,6 +50,13 @@ const char *path(const char *name)
     if (!base || !name || name[0] != '/') {
         return name;
     }
+    const char *p;
+    if (strstart(name, base, &p)) {
+        if (*p == '\0' || *p == '/' || (p > name && p[-1] == '/')) {
+            return name;
+        }
+    }
+
     qemu_mutex_lock(&lock);
     /* Have we looked up this file before? */
@@ -53,7 +67,7 @@ const char *path(const char *name)
         char *full = g_build_filename(base, name, NULL);
         /* Look for the path; record the result, pass or fail. */
-        if (access(full, F_OK) == 0) {
+        if (force_prefix || access(full, F_OK) == 0) {
             /* Exists. */
             g_hash_table_insert(hash, save, full);
             ret = full;
- AI-assisted deduplication
Once the program runs correctly, fuzzing can begin. With some luck, results come quickly. For example, I obtained the following:
total crashes : 5816 (62 saved)
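That gave me 62 payloads that cause a crash. Debugging showed, however, that many of them trigger at the same location, so the payloads need to be deduplicated. The AI can be asked to write a deduplication script. After some debugging and modification, I ended up with the following code: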
import os
import sys
import subprocess
import shutil
import re
import argparse
import glob


def get_crash_info(qemu_cmd, binary, harness, crash_file, timeout=2, env_vars=None):
    """Run one crash file under QEMU and return a deduplication signature (or None)."""
    cmd = list(qemu_cmd)
    cmd.append(binary)
    cmd.append(crash_file)
    env = os.environ.copy()
    if env_vars:
        env.update(env_vars)
    if harness and os.path.exists(harness):
        env['LD_PRELOAD'] = harness
    try:
        # Use Popen so that data left in the pipe buffers when the target is
        # killed by a signal is still read in full.
        process = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            env=env
        )
        try:
            stdout_data, stderr_data = process.communicate(timeout=timeout)
        except subprocess.TimeoutExpired:
            process.kill()
            stdout_data, stderr_data = process.communicate()
            return "HANG"
        full_output = (stdout_data + stderr_data).decode('utf-8', errors='ignore')
        # Deliberately loose regex: look for the PC/RA/SP keywords directly and
        # ignore the leading [HARNESS] tag and exact spacing.
        info_match = re.search(r'PC=(0x[0-9a-fA-F]+)[,\s]+RA=(0x[0-9a-fA-F]+)[,\s]+SP=(0x[0-9a-fA-F]+)', full_output)
        if info_match:
            pc = info_match.group(1).lower()
            ra = info_match.group(2).lower()
            sp = info_match.group(3).lower()
            # PC == RA usually means the return address on the stack was overwritten:
            # RA was loaded with overflow data and PC then jumped to it.
            # Both values then depend on the input and cannot serve as a signature,
            # so fall back to SP (the stack pointer) in this case.
            if pc == ra:
                return f"STACK_OVERFLOW_SP_{sp}"
            # Otherwise build the signature from PC, RA and SP.
            # This covers the main program (0x004...), libc functions (0x2b...)
            # and hijacked PCs (0x41...); crashes that differ in any of the
            # three registers are treated as distinct.
            return f"PC_{pc}_RA_{ra}_SP_{sp}"
        # Fallback: the target died from a signal but printed no register info.
        if "uncaught target signal" in full_output or process.returncode in [-11, 139, -6, 134]:
            return "UNKNOWN_SIGNAL"
    except Exception as e:
        print(f"[-] Error: {e}")
    return None


def main():
    parser = argparse.ArgumentParser(description="Smart Deduplication for Corrupted PC Crashes.")
    parser.add_argument("--binary", default="./goahead", help="Target binary path")
    parser.add_argument("--harness", default="./harness_debug.so", help="Harness .so path")
    parser.add_argument("--crash-dir", default="./crash", help="Directory containing crash files")
    parser.add_argument("--output-dir", default="./unique_crashes", help="Output directory")
    parser.add_argument("--qemu", default="qemu-mipsel", help="QEMU binary command")
    parser.add_argument("--sysroot", help="QEMU sysroot path (passed as -L)")
    args = parser.parse_args()
    qemu_base = [args.qemu]
    if args.sysroot:
        qemu_base.extend(["-L", args.sysroot])
    if not os.path.exists(args.binary):
        print(f"[!] Binary not found: {args.binary}")
        return
    if not os.path.isdir(args.crash_dir):
        print(f"[!] Crash directory not found: {args.crash_dir}")
        return
    files = sorted(glob.glob(os.path.join(args.crash_dir, "*")))
    if not files:
        print(f"[!] No files found in {args.crash_dir}")
        return
    print(f"[*] Found {len(files)} files. Starting smart analysis...")
    os.makedirs(args.output_dir, exist_ok=True)
    unique_crashes = {}
    count = 0
    value = 1  # index used to name the copied files crash1, crash2, ...
    for f in files:
        if os.path.basename(f).startswith("."):
            continue
        count += 1
        sys.stdout.write(f"\rProcessing {count}/{len(files)}")
        sys.stdout.flush()
        signature = get_crash_info(qemu_base, args.binary, args.harness, f)
        if signature:
            if signature not in unique_crashes:
                print(f"\n[+] Unique Crash: {signature} (File: {os.path.basename(f)})")
                unique_crashes[signature] = f
                dest = os.path.join(args.output_dir, f"crash{value}")
                value += 1
                shutil.copy2(f, dest)
    print(f"\n[*] Done. {len(unique_crashes)} unique crashes saved to {args.output_dir}")


if __name__ == "__main__":
    main()
# uv run dedup_fuzz.py --qemu /home/debian/fuzz/AFLplusplus/afl-qemu-trace --crash-dir crash --output-dir unique_crashes --binary ./sbin/goahead
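At first I could think of only one way to write the deduplication script: use gdb to collect the crash information, then parse it to deduplicate. The drawback is that gdb is slow. With the AI's help I then found two other approaches:
- Use QEMU's log output to obtain the address reached just before the crash.
- Hook the crash signal with a handler that prints the crash information.
I tried both. The first is also slow because it has to write the log, but its deduplication is very accurate. The second is the fastest and is the one currently in use; the signal hook function is part of the harness code shown earlier.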
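For the first approach, a minimal sketch of pulling an address out of a QEMU log might look like the following. It assumes the log was produced with `-d in_asm` and uses the `0x...: insn` line format seen in this article's trace; `last_logged_address` is a hypothetical helper name. Note that `in_asm` logs blocks when they are translated rather than every time they execute, so the last logged address only approximates the crash site.

```python
import re

# Matches disassembly lines such as "0x0042b01c:  jalr t9" in `-d in_asm` output.
_ADDR_LINE = re.compile(r'^(0x[0-9a-fA-F]+):\s')


def last_logged_address(log_text):
    """Return the address on the last disassembly line of the log, or None if there is none."""
    last = None
    for line in log_text.splitlines():
        match = _ADDR_LINE.match(line.strip())
        if match:
            last = int(match.group(1), 16)
    return last
```

The returned address can then serve as a deduplication key in place of the PC/RA/SP signature used by the script above.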
- AI-assisted fuzz dictionary writing
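To make fuzzing more efficient, you can also supply a fuzz dictionary. AFL already ships an HTTP dictionary at AFLplusplus/dictionaries/http.dict. Building on that, I asked the AI to generate a dictionary better suited to the target program, using this prompt:
Refer to AFL's HTTP fuzz dictionary, shown below: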
# Sources: https://en.wikipedia.org/wiki/List_of_HTTP_header_fields
# misc
"HTTP/1.1"
# verbs
"CONNECT"
"DELETE"
"GET"
"HEAD"
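Then combine it with what you learn from reverse engineering httpd to write a fuzz dictionary tailored to this device. You can also consider whether reverse engineering is too slow and not comprehensive enough. What about extracting the strings from the .data section instead? Would that be more suitable?
The command to start fuzzing with the dictionary is: afl-fuzz -x goahead.dict -Q -i in -o out -- ./sbin/goahead @@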
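The string-extraction idea from the prompt can be sketched roughly as follows. This is an illustration under assumptions, not the script the AI produced: `extract_strings` and `to_dict_entry` are hypothetical helpers, and the code scans a whole byte buffer for printable ASCII runs instead of parsing the ELF file to isolate one section.

```python
import re


def extract_strings(data, min_len=4):
    """Return printable-ASCII runs of at least min_len bytes, deduplicated, in first-seen order."""
    pattern = re.compile(rb'[\x20-\x7e]{%d,}' % min_len)
    seen, out = set(), []
    for match in pattern.finditer(data):
        text = match.group().decode('ascii')
        if text not in seen:
            seen.add(text)
            out.append(text)
    return out


def to_dict_entry(token):
    """Quote a token in AFL dictionary syntax, escaping backslashes and double quotes."""
    escaped = token.replace('\\', '\\\\').replace('"', '\\"')
    return f'"{escaped}"'
```

Writing `to_dict_entry(s)` for each extracted string, one per line, gives a file that can be passed to `afl-fuzz -x`. In practice the list would still need filtering, since most strings in a binary are not useful HTTP tokens.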
- AI-assisted crash analysis
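If you are lucky, or the fuzzed program has many problems, you may collect a large number of crashes. Even after deduplication there is a lot to go through, and analyzing the crashes one by one by hand is slow, so I looked into whether the AI could help with the analysis.
The preparation looks like this: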
$ cat crash/crash1
POST ?go/setSip_account /1.1
Host: 10:8080
Connection: keep-alive
Content-Length: 9#0
Cache-Control:IP_TRUNK_OUTBO max-age=0
Origin: http://10.0.25.10:8080
DNT: 1
Upgrade-Insecure-Requestsre-Requests: 1
Content-Type: a: 1
Content-Type: application/x-www-form-urlencoded
User-Agent: Mozilla/5.0 (Macintosh; Intel37.36 (KHTML, like Gecko) Chrome/143.0.0.0 Safari/537.36
Accept: Text/html,apcatio�/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Referer: http://10.0.25.10:8080/FXO/sip_trup-alive
Content-Length:nk.asp
Accept-Encoding: gzip, deflate
Accept-Language: zh-CN,zh;q=0.9,en-US;q=0.8,en;q=0.7
Cookie: ASPSSIONID=SIASTHVKP3WZDIP7BHABHH3CK01LUWNH
CV3VBH8080
ConnectV6C
$ cat crash/crash1.txt
$ afl-qemu-trace -d in_asm -D trace.log ./sbin/goahead unique_crashes/crash1
fuzz function address: 0x2b2f7b10
FUN=websOpenServer, port=8080,cmd_port=8080,cmd_web_ssl_disable=0
[HARNESS] CRASH_INFO: PC=0x2b6e2630, RA=0x42b024, SP=0x2b2a9f60
[HARNESS] STACK_DUMP: 00000000 2b2aa1b8 2b2aa1bc 00588e4f
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
[1] 1931096 segmentation fault afl-qemu-trace -d in_asm -D trace.log
$ tail -n 50 trace.log
0x0042a2a8: lw a0,664(sp)
0x0042a2ac: nop
0x0042a2b0: lw s0,324(a0)
0x0042a2b4: nop
0x0042a2b8: srl v0,s0,0xe
0x0042a2bc: andi v0,v0,0x1
0x0042a2c0: beqz v0,0x42ae94
0x0042a2c4: li v0,-1
----------------
IN(1929072):
0x0042ae94: lw v1,664(sp)
0x0042ae98: nop
0x0042ae9c: lw s1,268(v1)
0x0042aea0: nop
0x0042aea4: beqz s1,0x42af3c
0x0042aea8: nop
----------------
IN(1929072):
0x0042aeac: lb v0,0(s1)
0x0042aeb0: nop
0x0042aeb4: beqz v0,0x42aec8
0x0042aeb8: srl v0,s0,0xd
----------------
IN(1929072):
0x0042aebc: andi v0,v0,0x1
0x0042aec0: beqz v0,0x42affc
0x0042aec4: nop
----------------
IN(1929072):
0x0042affc: lw t9,-32292(gp)
----------------
IN(1929072):
0x0042b000: nop
0x0042b004: jalr t9
0x0042b008: move a0,s1
----------------
IN(1929072):
0x0042b00c: lw gp,40(sp)
0x0042b010: lw a0,568(sp)
0x0042b014: lw t9,-32292(gp)
0x0042b018: nop
0x0042b01c: jalr t9
0x0042b020: move s0,v0
$ cat mem.txt
400000-4ed000 r-xp 00000000 fe:00 144595 /squashfs-root/sbin/goahead
4ed000-52c000 ---p 00000000 00:00 0
52c000-539000 rw-p 000ec000 fe:00 144595 /squashfs-root/sbin/goahead
539000-56f000 rw-p 00000000 00:00 0
2aaab000-2aaac000 ---p 00000000 00:00 0
2aaac000-2b2ac000 rwxp 00000000 00:00 0 [stack]
2b2ac000-2b2b2000 r-xp 00000000 fe:00 144231 /squashfs-root/lib/ld-uClibc-0.9.28.so
2b2b2000-2b2f1000 ---p 00000000 00:00 0
2b2f1000-2b2f2000 r--p 00005000 fe:00 144231 /squashfs-root/lib/ld-uClibc-0.9.28.so
2b2f2000-2b2f3000 rw-p 00006000 fe:00 144231 /squashfs-root/lib/ld-uClibc-0.9.28.so
2b2f3000-2b2f4000 r-xp 00000000 00:00 0
2b2f4000-2b2f5000 rw-p 00000000 00:00 0
2b2f6000-2b2fb000 r-xp 00000000 fe:00 166802 /squashfs-root/hook/harness_debug.so
2b2fb000-2b30a000 ---p 00000000 00:00 0
2b30a000-2b30b000 rw-p 00004000 fe:00 166802 /squashfs-root/hook/harness_debug.so
2b30b000-2b30f000 r-xp 00000000 fe:00 144165 /squashfs-root/lib/libnvram-0.9.28.so
2b30f000-2b34f000 ---p 00000000 00:00 0
2b34f000-2b350000 rw-p 00004000 fe:00 144165 /squashfs-root/lib/libnvram-0.9.28.so
2b350000-2b363000 r-xp 00000000 fe:00 144157 /squashfs-root/lib/libdevmanager-0.9.28.so
2b363000-2b3a2000 ---p 00000000 00:00 0
2b3a2000-2b3a3000 r--p 00012000 fe:00 144157 /squashfs-root/lib/libdevmanager-0.9.28.so
2b3a3000-2b3a4000 rw-p 00013000 fe:00 144157 /squashfs-root/lib/libdevmanager-0.9.28.so
2b3a4000-2b3a5000 rw-p 00000000 00:00 0
2b3a5000-2b3b4000 r-xp 00000000 fe:00 144163 /squashfs-root/lib/libpthread-0.9.28.so
2b3b4000-2b3f3000 ---p 00000000 00:00 0
2b3f3000-2b3f8000 rw-p 0000e000 fe:00 144163 /squashfs-root/lib/libpthread-0.9.28.so
2b3f8000-2b3fa000 rw-p 00000000 00:00 0
2b3fa000-2b421000 r-xp 00000000 fe:00 144227 /squashfs-root/lib/libpcre.8.3.8.so
2b421000-2b460000 ---p 00000000 00:00 0
2b460000-2b461000 rw-p 00026000 fe:00 144227 /squashfs-root/lib/libpcre.8.3.8.so
2b461000-2b4be000 r-xp 00000000 fe:00 144154 /squashfs-root/lib/libssl.so.1.0.0
2b4be000-2b4cd000 ---p 00000000 00:00 0
2b4cd000-2b4d3000 rw-p 0005c000 fe:00 144154 /squashfs-root/lib/libssl.so.1.0.0
2b4d3000-2b672000 r-xp 00000000 fe:00 144233 /squashfs-root/lib/libcrypto.so.1.0.0
2b672000-2b681000 ---p 00000000 00:00 0
2b681000-2b697000 rw-p 0019e000 fe:00 144233 /squashfs-root/lib/libcrypto.so.1.0.0
2b697000-2b699000 rw-p 00000000 00:00 0
2b699000-2b732000 r-xp 00000000 fe:00 144174 /squashfs-root/lib/libuClibc-0.9.28.so
2b732000-2b771000 ---p 00000000 00:00 0
2b771000-2b772000 r--p 00098000 fe:00 144174 /squashfs-root/lib/libuClibc-0.9.28.so
2b772000-2b773000 rw-p 00099000 fe:00 144174 /squashfs-root/lib/libuClibc-0.9.28.so
2b773000-2b779000 rw-p 00000000 00:00 0
2b779000-2b77b000 r-xp 00000000 fe:00 144228 /squashfs-root/lib/libdl-0.9.28.so
2b77b000-2b7ba000 ---p 00000000 00:00 0
2b7ba000-2b7bb000 r--p 00001000 fe:00 144228 /squashfs-root/lib/libdl-0.9.28.so
2b7bb000-2b7bc000 rw-p 00002000 fe:00 144228 /squashfs-root/lib/libdl-0.9.28.so
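The memory map is what lets the crash addresses be tied back to a module. As a minimal sketch (the helper names `parse_maps` and `resolve` are made up for illustration), the lookup could be done like this, assuming mem.txt uses the `/proc/<pid>/maps` layout shown above:

```python
def parse_maps(text):
    """Parse maps-style lines into (start, end, perms, file_offset, path) tuples."""
    regions = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 5 or '-' not in parts[0]:
            continue
        start_s, end_s = parts[0].split('-')
        path = parts[5] if len(parts) > 5 else ''
        regions.append((int(start_s, 16), int(end_s, 16), parts[1], int(parts[2], 16), path))
    return regions


def resolve(addr, regions):
    """Return (path, offset into the mapped file) for the region containing addr, or None."""
    for start, end, _perms, offset, path in regions:
        if start <= addr < end:
            return path, addr - start + offset
    return None
```

Applied to the values in this article, the PC `0x2b6e2630` falls in the executable mapping of `libuClibc-0.9.28.so` at offset `0x49630`, and the RA `0x42b024` falls in `goahead`. Whether that offset equals the address shown in IDA depends on the load address chosen in the database, which is an assumption to verify per binary.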
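The AI's analysis builds on IDA's reverse-engineering output, so the more readable the decompilation, the better the analysis. Depending on the case, you can therefore also have the AI refine structure definitions for you, for example:
Reverse engineer the goahead program for me. The main task is to analyze the webs_t structure and define it. My suggestion: first try to identify the goahead version, then look up the webs_t definition for that version, and finally combine that with what the reverse engineering of this particular goahead binary shows to arrive at the webs_t definition. You do not need to define every member of the structure; what matters is that every member used in the websReadEvent function has a suitable definition.
It generated the following code for me: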
// webs_def.h
#ifndef WEBS_DEF_H
#define WEBS_DEF_H
/* Ring Queue Structure - Assumed based on standard GoAhead */
typedef struct ringq {
unsigned char *serv_p; /* Service get pointer */
unsigned char *buf; /* Ring queue buffer */
long size; /* Ring queue size */
long len; /* Length of data in queue */
long end_p; /* End of data pointer */
long increment; /* Growth increment */
long maxsize; /* Max size of queue */
} ringq_t;
/* webs_t Structure - Reverse engineered from goahead binary (MIPS) */
typedef struct websRec {
/* 0x00 (0) */
ringq_t header; /* Header dynamic string (Indices 0-6) */
/* 0x1C (7) - Padding/Unknown fields */
/* Indices 7 to 45 (39 ints = 156 bytes) */
int _pad_unknown[39];
/* 0xB8 (46) */
char type[64]; /* Mime type (Indices 46-61) */
/* 0xF8 (62) */
int _pad_62;
/* 0xFC (63) */
char *path; /* Path */
/* 0x100 (64) */
char *url; /* URL */
/* 0x104 (65) */
char *host; /* Host */
/* 0x108 (66) */
int _pad_66;
/* 0x10C (67) */
char *query; /* Query string (Also used for v92 from websUrlParse) */
/* 0x110 (68) */
int _pad_68;
/* 0x114 (69) */
char *authType; /* Authorization type (e.g., "Basic") */
/* 0x118 (70) */
char *password; /* Password (from Base64 decode) */
/* 0x11C (71) */
char *username; /* Username (from Base64 decode) */
/* 0x120 (72) */
char *cookie; /* Cookie string */
/* 0x124 (73) */
int _pad_73;
/* 0x128 (74) */
char *userAgent; /* User-Agent string */
/* 0x12C (75) */
char *protocol; /* Protocol (e.g., "http") */
/* 0x130 (76) */
char *protoVersion; /* Protocol Version (e.g., "HTTP/1.1") */
/* 0x134 (77) */
int socket; /* Socket file descriptor */
/* 0x138 (78) */
int listenFd; /* Listen socket descriptor */
/* 0x13C (79) */
int port; /* Port number */
/* 0x140 (80) */
int state; /* Connection state (1, 2, 4, 8) */
/* 0x144 (81) */
unsigned int flags; /* Connection flags (Bitmask) */
/* 0x148 (82) */
int _pad_82;
/* 0x14C (83) */
int contentLength; /* Content-Length */
/* 0x150 (84) */
int _pad_84;
/* 0x154 (85) */
char *cgiCommName; /* Temporary file path or CGI comm name */
/* 0x158 (86) */
int _pad_86;
/* 0x15C (87) */
int _pad_87;
/* 0x160 (88) */
int _pad_88;
/* 0x164 (89) */
int docFdOpen; /* Flag indicating if doc FD is open */
/* 0x168 (90) */
int _pad_90;
/* 0x16C (91) */
char *realm; /* Auth Realm */
/* 0x170 (92) */
char *nonce; /* Auth Nonce */
/* 0x174 (93) */
char *response; /* Auth Response */
/* 0x178 (94) */
char *uri; /* Auth URI */
/* 0x17C (95) */
char *opaque; /* Auth Opaque */
/* 0x180 (96) */
char *nc; /* Auth NC */
/* 0x184 (97) */
char *cnonce; /* Auth Cnonce */
/* 0x188 (98) */
char *qop; /* Auth QOP */
/* 0x18C (99) */
int _pad_99;
/* 0x190 (100) */
int queryInt; /* Integer conversion of Query (or related field) */
/* 0x194 (101) */
char *referer; /* Referer */
/* 0x198 (102) */
void *ssl; /* SSL Handle / Pointer */
/* 0x19C (103) */
int _pad_103;
/* 0x1A0 (104) */
int _pad_104;
/* 0x1A4 (105) */
int flags_105; /* Error or termination flag */
} websRec;
typedef websRec *webs_t;
#endif // WEBS_DEF_H
The crash point of crash1 is in the websReadEvent function, so a well-defined webs_t structure makes that function far more readable.
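To see why the structure helps, the offsets in the header can be matched against the loads in the trace. The sketch below copies a few offsets from the comments in webs_def.h; `webs_field_for_load` is a hypothetical helper, and which registers hold the `webs_t` pointer is an assumption the caller has to supply.

```python
import re

# Offsets copied from the comments in webs_def.h (32-bit MIPS layout).
WEBS_FIELDS = {
    0x0FC: "path",
    0x100: "url",
    0x104: "host",
    0x10C: "query",
    0x140: "state",
    0x144: "flags",
    0x14C: "contentLength",
}

_LOAD = re.compile(r'(-?\d+)\((\w+)\)')


def webs_field_for_load(insn, webs_regs):
    """Name the webs_t field read by a load such as 'lw s0,324(a0)'.

    webs_regs is the set of registers assumed to hold the webs_t pointer.
    Returns None when the base register is not in that set.
    """
    match = _LOAD.search(insn)
    if not match or match.group(2) not in webs_regs:
        return None
    offset = int(match.group(1))
    return WEBS_FIELDS.get(offset, f"unknown_0x{offset:X}")
```

For the trace above, `lw s0,324(a0)` maps to `flags` and `lw s1,268(v1)` maps to `query`, provided `a0` and `v1` hold the `webs_t` pointer at those points.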
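Note: I did not work this way from the start. The process above is where I ended up after adjusting step by step to the situation at hand.
With the preparation done, we can ask the AI to analyze the cause of the crash, as follows:
I obtained several crash samples by fuzzing the goahead program. Using the debugging information and reverse engineering of the program, please try to work out the cause of the crash. Known information:
1. Memory layout: @mem.txt
2. Crash samples are in the crash directory; files with the txt suffix hold the debugging information for the corresponding crash. For now you only need to analyze crash1.
3. All relevant binaries are in the current directory; use IDA to reverse engineer them if needed.
4. Write the result, in Chinese, to crash1.md.
At first the crash directory held several crash samples and I tried having the AI analyze them all at once. The results were poor and it was easy to exceed the token limit, so I settled on analyzing them one at a time.
The final analysis document is shown below: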
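# GoAhead Crash1 Analysis Report (final revised version)
## 1. Crash overview
* **Program**: `goahead`
* **Crash signal**: Signal 11 (Segmentation Fault)
* **Crash address (PC)**: `0x2b6e2630` (inside the `strlen` function of `libuClibc-0.9.28.so`)
* **Return address (RA)**: `0x42b024` (inside the `websReadEvent` function of `goahead`)
## 2. Detailed analysis
### 2.1 Crash site
* **Crash point**: the call to `strlen(v87)` in `websReadEvent`.
* **Direct cause**: the argument `v87` passed to `strlen` is `NULL`.
### 2.2 Reproduction logic
The bug is triggered by a specific combination of HTTP request headers and involves a logic flaw in the state-machine transitions.
#### Step 1: entering State 2 (Header Processing)
The program first reads the HTTP header data and sets the state (`state`) to 2.
On the following call to `websReadEvent`, if `socketGets` returns 0 (no new socket data was read), the program skips the data-reading block and goes straight into the State 2 handling logic:

    // v3 is the return value of socketGets; it is 0 here
    if ( v3 ) { ... } // skipped
    if ( a1->state != 2 ) goto LABEL_44;
    // enter the State 2 header-parsing logic
    websSetVar(a1, "HTTP_AUTHORIZATION", "");
    v5 = a1->header.buf;
    // ... loop that parses the headers ...

**Key point**: the local variable `v87` is initialized to 0 (NULL) at this point and is never assigned, because `socketGets` returned 0.
#### Step 2: parsing the malformed Content-Length
The request contains two `Content-Length` headers:
1. `Content-Length: 9#0` -> parsed as length 9, and the `0x400` (WEBS_CL_HEADER) flag is set.
2. `Content-Length: nk.asp` -> `atoi` returns 0. The code updates `contentLength` to 0 but **does not clear** the `0x400` flag.
#### Step 3: the wrong jump
After header parsing finishes, the program checks the flag bits:

    // check the 0x400 flag (set because the first header is present)
    if ( ((v14 >> 10) & 1) != 0 )
    {
        v24 = a1->contentLength; // 0 at this point (overwritten by the second header)
        a1->state = 8;           // switch to State 8 (WEBS_POST)

        // if the length is <= 0, jump straight to LABEL_38 to process data
        if ( v24 <= 0 )
            goto LABEL_38;

        goto LABEL_4;
    }

The program wrongly reasons that "a Content-Length is present and the length is 0, so the data has probably already been read or is sitting in the buffer", and therefore jumps straight to `LABEL_38` to process the "remaining" data.
#### Step 4: NULL pointer dereference
The code jumps to `LABEL_38`, which is the entry point of data processing.

    LABEL_38:
        v18 = a1->state; // 8
        // ...
        if ( v18 != 8 ) ...

        // State 8 handling logic
        // v87 is still NULL here
        else
        {
            v66 = a1->query;
            // ...
            // crash happens here: strlen(NULL)
            v74 = strlen(v87);
            // ...
        }

The program tries to compute the length of the "newly received data" by calling `strlen` directly on `v87`, which causes the crash.
## 3. Conclusion
This is a logic bug. While `goahead` is in State 2 (parsing headers), if it encounters a `Content-Length` of 0 (or one that fails to parse and yields 0), it wrongly assumes the data pointer `v87` is valid and jumps straight to the data-processing logic (`LABEL_38`), without accounting for the fact that on this path (`socketGets` returned 0) `v87` is actually a NULL pointer.
## 4. Suggested fix
The State 8 handling logic after `LABEL_38` must check whether `v87` is NULL.
**Example fix:**

    // inside the block that handles State 8
    if ( v18 == 8 )
    {
        // ...
        if ( v66 )
        {
            // add a NULL pointer check
            if ( v87 )
            {
                v74 = strlen(v87);
                // ... brealloc ...
            }
        }
        // ...
    }

Alternatively, before the `goto LABEL_38` decision in State 2, make sure the code does not jump to data processing while `v87` is NULL.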
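Step 2 of the report depends on how `atoi` handles the two malformed values. The sketch below models that in Python. `c_atoi` approximates C's `atoi` for in-range inputs, and `apply_content_length_headers` is a hypothetical model of the behaviour the report describes (the length is overwritten by each header, the flag is only ever set), not code recovered from the binary.

```python
WEBS_CL_HEADER = 0x400  # flag value quoted in the report


def c_atoi(text):
    """Approximate C atoi: skip leading whitespace, take an optional sign, then leading digits."""
    s = text.lstrip(" \t\n\v\f\r")
    sign = 1
    if s[:1] in ("+", "-"):
        sign = -1 if s[0] == "-" else 1
        s = s[1:]
    value = 0
    for ch in s:
        if not ("0" <= ch <= "9"):
            break
        value = value * 10 + ord(ch) - ord("0")
    return sign * value


def apply_content_length_headers(values):
    """Hypothetical model: each header overwrites the length; the flag is set but never cleared."""
    flags, content_length = 0, 0
    for value in values:
        content_length = c_atoi(value)
        if content_length > 0:
            flags |= WEBS_CL_HEADER
    return flags, content_length
```

Feeding it the two values from crash1, `"9#0"` and `"nk.asp"`, leaves the flag set with a length of 0, which is the inconsistent state that sends the code to `LABEL_38`.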
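That completes the fuzzing workflow. As you can see, today's top Gemini model can already stand in for you very well at every stage of fuzzing. For the last stage, crash analysis, I still prefer to use a local model over a remote one, and I look forward to local models reaching the performance of the current Gemini 3 at low cost.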
Reposted from: 知道创宇404实验室 (Knownsec 404 Lab), "原创 Paper | 使用 AI 辅助 fuzz 实战记录"