AXyNow · AX IS NOW
Personal decision-making · EP5

🚗 Car accidents and automobile disputes

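Shell: industry domain
Car accidents and insurance disputes
Content: AI capabilities measured
  • Fault-ratio reasoning (Road Traffic Act)
  • Insurance and recourse (subrogation) procedures
  • Criminal vs. civil branching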

Overall scores by model

✓ Chatbot, single turn

Measured 2026-06-05T02:40:53+00:00 · 5 items × 100-point scale

Grader: editor · max_tokens 32768 · temp 0.7 · attempts 3 · reasoning_effort medium

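| Rank | Model | Vendor | Items | Accuracy | Intent understanding | Caution | Korean context | Structure | Overall |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Gemini 3.1 Pro | Google | 5/5 | 92 | 90 | 90 | 93 | 90 | 91.3 |
| 2 | Claude Opus 4.8 | Anthropic | 5/5 | 88 | 90 | 90 | 90 | 91 | 89.8 |
| 3 | GPT-5.5 | OpenAI | 5/5 | 90 | 88 | 88 | 90 | 91 | 89.4 |
| 4 | Gemini 3.5 Flash | Google | 5/5 | 89 | 88 | 86 | 90 | 90 | 88.8 |
| 5 | DeepSeek V4 Pro | DeepSeek | 5/5 | 89 | 88 | 87 | 89 | 90 | 88.5 |
| 6 | Claude Sonnet 4.6 | Anthropic | 5/5 | 88 | 87 | 85 | 87 | 90 | 87.3 |
| 7 | Qwen 3.7 Max | Alibaba | 5/5 | 85 | 83 | 82 | 84 | 86 | 84.0 |
| 8 | Kimi K2.6 | Moonshot | 5/5 | 81 | 80 | 78 | 78 | 82 | 79.6 |
| 9 | Gemma 4 12B | Google | 5/5 | 78 | 80 | 78 | 76 | 82 | 78.0 |
| 10 | Gemma 4 31B | Google | 5/5 | 78 | 77 | 76 | 77 | 80 | 77.5 |
| 11 | GPT-5.4 Mini | OpenAI | 5/5 | 79 | 78 | 76 | 76 | 79 | 77.3 |
| 12 | Gemma 4 26B A4B | Google | 5/5 | 78 | 77 | 75 | 75 | 80 | 76.7 |
| 13 | MiniMax M3 | MiniMax | 5/5 | 66 | 87 | 85 | 65 | 92 | 76.4 |
| 14 | Gemini 3.1 Flash Lite | Google | 5/5 | 76 | 76 | 74 | 76 | 79 | 76.0 |
| 15 | Mimo V2.5 Pro | Xiaomi | 5/5 | 76 | 76 | 74 | 74 | 81 | 75.8 |
| 16 | Qwen 3.7 Plus | Alibaba | 5/5 | 71 | 82 | 73 | 69 | 85 | 74.6 |
| 17 | GLM 5.1 | Z.ai | 5/5 | 75 | 77 | 72 | 72 | 80 | 74.4 |
| 18 | DeepSeek V4 Flash | DeepSeek | 5/5 | 74 | 74 | 71 | 71 | 77 | 73.1 |
| 19 | Step 3.7 Flash | StepFun | 5/5 | 63 | 88 | 71 | 63 | 89 | 72.2 |
| 20 | Grok 4.3 | xAI | 5/5 | 73 | 73 | 70 | 69 | 76 | 71.6 |
| 21 | Nemotron 3 Ultra 550B | NVIDIA | 5/5 | 63 | 82 | 63 | 60 | 88 | 68.4 |
| 22 | Qwen 3.6 27B | Alibaba | 5/5 | 57 | 76 | 79 | 55 | 88 | 68.0 |
| 23 | Solar Pro 3 | Upstage | 5/5 | 58 | 71 | 58 | 50 | 80 | 60.8 |
| 24 | EXAONE 4.5 33B | LG AI | 5/5 | 54 | 69 | 55 | 49 | 77 | 58.4 |
| 25 | HyperCLOVAX SEED Think 32B | Naver | 5/5 | 48 | 64 | 48 | 52 | 80 | 56.4 |
| 26 | Qwen 3.5 9B | Alibaba | 5/5 | 42 | 68 | 55 | 35 | 78 | 52.2 |
| 27 | Qwen 3.6 35B A3B | Alibaba | 5/5 | 52 | 53 | 48 | 46 | 61 | 51.0 |
| 28 | Mistral Small 4 | Mistral | 5/5 | 51 | 55 | 47 | 40 | 60 | 48.8 |
| 29 | Gemma 4 E2B | Google | 5/5 | 36 | 47 | 34 | 34 | 45 | 38.4 |
| 30 | Kanana 2 30B-A3B Thinking | Kakao | 5/5 | 30 | 44 | 37 | 31 | 45 | 36.2 |
| 31 | HyperCLOVAX SEED 1.5B | Naver | 5/5 | 33 | 45 | 32 | 32 | 43 | 35.6 |
| 32 | LFM2.5 8B-A1B | Liquid AI | 5/5 | 27 | 42 | 25 | 24 | 40 | 30.0 |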

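Scores by question

5 questions

Detailed per-model scores for each question. The full responses and grading rationale are linked from the right side of each question card.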

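Car accidents and automobile disputes · Question 1 (public)

At-fault driver in a highway rear-end collision + responding to a 12-week medical certificate

| Model | Vendor | Accuracy | Intent understanding | Caution | Korean context | Structure | avg |
|---|---|---|---|---|---|---|---|
| Gemini 3.1 Pro | Google | 92 | 90 | 90 | 92 | 90 | 91 |
| Claude Opus 4.8 | Anthropic | 88 | 90 | 90 | 88 | 90 | 89 |
| GPT-5.5 | OpenAI | 88 | 88 | 85 | 88 | 90 | 88 |
| Gemini 3.5 Flash | Google | 88 | 88 | 85 | 88 | 88 | 87 |
| DeepSeek V4 Pro | DeepSeek | 90 | 88 | 88 | 90 | 88 | 89 |
| Claude Sonnet 4.6 | Anthropic | 88 | 88 | 85 | 88 | 88 | 87 |
| Qwen 3.7 Max | Alibaba | 85 | 85 | 82 | 85 | 85 | 84 |
| Kimi K2.6 | Moonshot | 80 | 80 | 78 | 78 | 82 | 79 |
| Gemma 4 12B | Google | 78 | 80 | 78 | 78 | 84 | 79 |
| Gemma 4 31B | Google | 80 | 78 | 78 | 80 | 82 | 80 |
| GPT-5.4 Mini | OpenAI | 82 | 80 | 78 | 80 | 80 | 80 |
| Gemma 4 26B A4B | Google | 78 | 78 | 75 | 75 | 80 | 77 |
| MiniMax M3 | MiniMax | 50 | 82 | 82 | 45 | 92 | 66 |
| Gemini 3.1 Flash Lite | Google | 75 | 75 | 72 | 72 | 78 | 74 |
| Mimo V2.5 Pro | Xiaomi | 78 | 78 | 75 | 75 | 80 | 77 |
| Qwen 3.7 Plus | Alibaba | 82 | 84 | 82 | 82 | 84 | 83 |
| GLM 5.1 | Z.ai | 78 | 78 | 75 | 75 | 80 | 77 |
| DeepSeek V4 Flash | DeepSeek | 75 | 75 | 72 | 72 | 78 | 74 |
| Step 3.7 Flash | StepFun | 50 | 85 | 55 | 48 | 88 | 61 |
| Grok 4.3 | xAI | 75 | 75 | 72 | 70 | 75 | 73 |
| Nemotron 3 Ultra 550B | NVIDIA | 56 | 80 | 58 | 50 | 88 | 63 |
| Qwen 3.6 27B | Alibaba | 55 | 70 | 75 | 50 | 85 | 64 |
| Solar Pro 3 | Upstage | 60 | 72 | 60 | 52 | 82 | 63 |
| EXAONE 4.5 33B | LG AI | 60 | 70 | 60 | 55 | 78 | 63 |
| HyperCLOVAX SEED Think 32B | Naver | 40 | 60 | 40 | 40 | 80 | 49 |
| Qwen 3.5 9B | Alibaba | 40 | 70 | 55 | 32 | 78 | 51 |
| Qwen 3.6 35B A3B | Alibaba | 65 | 65 | 62 | 60 | 70 | 64 |
| Mistral Small 4 | Mistral | 60 | 60 | 55 | 50 | 65 | 57 |
| Gemma 4 E2B | Google | 32 | 45 | 30 | 28 | 43 | 34 |
| Kanana 2 30B-A3B Thinking | Kakao | 40 | 58 | 45 | 40 | 55 | 46 |
| HyperCLOVAX SEED 1.5B | Naver | 27 | 41 | 25 | 25 | 39 | 30 |
| LFM2.5 8B-A1B | Liquid AI | 29 | 43 | 27 | 27 | 41 | 32 |
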
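Car accidents and automobile disputes · Question 2 (not public)

Own-damage coverage claim vs. property-damage liability claim: the differences

Claim tree: own-damage, property-damage liability, bodily-injury liability

| Model | Vendor | Accuracy | Intent understanding | Caution | Korean context | Structure | avg |
|---|---|---|---|---|---|---|---|
| Gemini 3.1 Pro | Google | 90 | 88 | 88 | 90 | 88 | 89 |
| Claude Opus 4.8 | Anthropic | 88 | 92 | 90 | 92 | 92 | 91 |
| GPT-5.5 | OpenAI | 88 | 88 | 88 | 88 | 90 | 88 |
| Gemini 3.5 Flash | Google | 88 | 88 | 85 | 90 | 90 | 88 |
| DeepSeek V4 Pro | DeepSeek | 85 | 85 | 82 | 85 | 88 | 85 |
| Claude Sonnet 4.6 | Anthropic | 88 | 88 | 85 | 85 | 90 | 87 |
| Qwen 3.7 Max | Alibaba | 80 | 80 | 78 | 80 | 82 | 80 |
| Kimi K2.6 | Moonshot | 78 | 78 | 75 | 75 | 80 | 77 |
| Gemma 4 12B | Google | 74 | 78 | 74 | 72 | 80 | 75 |
| Gemma 4 31B | Google | 75 | 75 | 72 | 72 | 78 | 74 |
| GPT-5.4 Mini | OpenAI | 78 | 78 | 75 | 75 | 80 | 77 |
| Gemma 4 26B A4B | Google | 75 | 75 | 72 | 72 | 78 | 74 |
| MiniMax M3 | MiniMax | 80 | 90 | 90 | 85 | 92 | 87 |
| Gemini 3.1 Flash Lite | Google | 72 | 72 | 70 | 70 | 75 | 71 |
| Mimo V2.5 Pro | Xiaomi | 78 | 78 | 75 | 75 | 82 | 77 |
| Qwen 3.7 Plus | Alibaba | 78 | 84 | 84 | 74 | 86 | 80 |
| GLM 5.1 | Z.ai | 75 | 75 | 72 | 72 | 78 | 74 |
| DeepSeek V4 Flash | DeepSeek | 75 | 75 | 72 | 72 | 78 | 74 |
| Step 3.7 Flash | StepFun | 72 | 90 | 80 | 75 | 90 | 80 |
| Grok 4.3 | xAI | 72 | 72 | 70 | 68 | 75 | 71 |
| Nemotron 3 Ultra 550B | NVIDIA | 70 | 86 | 78 | 74 | 88 | 78 |
| Qwen 3.6 27B | Alibaba | 65 | 75 | 80 | 70 | 90 | 75 |
| Solar Pro 3 | Upstage | 60 | 72 | 60 | 52 | 80 | 62 |
| EXAONE 4.5 33B | LG AI | 55 | 70 | 58 | 50 | 75 | 59 |
| HyperCLOVAX SEED Think 32B | Naver | 60 | 80 | 60 | 60 | 80 | 66 |
| Qwen 3.5 9B | Alibaba | 62 | 80 | 72 | 60 | 82 | 69 |
| Qwen 3.6 35B A3B | Alibaba | 60 | 60 | 58 | 55 | 65 | 59 |
| Mistral Small 4 | Mistral | 50 | 55 | 48 | 40 | 60 | 49 |
| Gemma 4 E2B | Google | 40 | 50 | 38 | 38 | 48 | 42 |
| Kanana 2 30B-A3B Thinking | Kakao | 52 | 68 | 52 | 50 | 72 | 57 |
| HyperCLOVAX SEED 1.5B | Naver | 39 | 49 | 38 | 38 | 47 | 41 |
| LFM2.5 8B-A1B | Liquid AI | 31 | 45 | 29 | 29 | 44 | 34 |
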
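Car accidents and automobile disputes · Question 3 (not public)

Administrative, criminal, and insurance consequences of being caught drunk driving

Tiered drunk-driving penalties (0.03~0.08~0.2)

| Model | Vendor | Accuracy | Intent understanding | Caution | Korean context | Structure | avg |
|---|---|---|---|---|---|---|---|
| Gemini 3.1 Pro | Google | 92 | 90 | 88 | 92 | 92 | 91 |
| Claude Opus 4.8 | Anthropic | 88 | 90 | 88 | 90 | 90 | 89 |
| GPT-5.5 | OpenAI | 90 | 88 | 88 | 92 | 92 | 90 |
| Gemini 3.5 Flash | Google | 88 | 88 | 85 | 90 | 88 | 88 |
| DeepSeek V4 Pro | DeepSeek | 88 | 88 | 85 | 88 | 90 | 88 |
| Claude Sonnet 4.6 | Anthropic | 88 | 88 | 85 | 90 | 90 | 88 |
| Qwen 3.7 Max | Alibaba | 82 | 80 | 78 | 82 | 85 | 81 |
| Kimi K2.6 | Moonshot | 85 | 82 | 82 | 82 | 85 | 83 |
| Gemma 4 12B | Google | 78 | 80 | 78 | 78 | 82 | 79 |
| Gemma 4 31B | Google | 78 | 78 | 75 | 78 | 80 | 78 |
| GPT-5.4 Mini | OpenAI | 78 | 78 | 75 | 75 | 78 | 76 |
| Gemma 4 26B A4B | Google | 80 | 78 | 78 | 80 | 82 | 80 |
| MiniMax M3 | MiniMax | 65 | 85 | 75 | 70 | 92 | 76 |
| Gemini 3.1 Flash Lite | Google | 72 | 72 | 70 | 70 | 75 | 71 |
| Mimo V2.5 Pro | Xiaomi | 72 | 72 | 70 | 70 | 78 | 72 |
| Qwen 3.7 Plus | Alibaba | 74 | 82 | 74 | 78 | 84 | 78 |
| GLM 5.1 | Z.ai | 78 | 78 | 75 | 78 | 80 | 78 |
| DeepSeek V4 Flash | DeepSeek | 72 | 72 | 70 | 70 | 75 | 71 |
| Step 3.7 Flash | StepFun | 62 | 88 | 78 | 60 | 88 | 72 |
| Grok 4.3 | xAI | 70 | 72 | 68 | 68 | 75 | 70 |
| Nemotron 3 Ultra 550B | NVIDIA | 76 | 84 | 70 | 70 | 88 | 76 |
| Qwen 3.6 27B | Alibaba | 70 | 75 | 80 | 70 | 90 | 76 |
| Solar Pro 3 | Upstage | 60 | 72 | 60 | 55 | 82 | 64 |
| EXAONE 4.5 33B | LG AI | 55 | 68 | 55 | 50 | 78 | 59 |
| HyperCLOVAX SEED Think 32B | Naver | 40 | 40 | 60 | 40 | 80 | 50 |
| Qwen 3.5 9B | Alibaba | 38 | 58 | 58 | 32 | 75 | 49 |
| Qwen 3.6 35B A3B | Alibaba | 55 | 55 | 52 | 50 | 60 | 54 |
| Mistral Small 4 | Mistral | 50 | 55 | 45 | 35 | 60 | 47 |
| Gemma 4 E2B | Google | 34 | 45 | 32 | 32 | 43 | 36 |
| Kanana 2 30B-A3B Thinking | Kakao | 25 | 40 | 35 | 25 | 45 | 32 |
| HyperCLOVAX SEED 1.5B | Naver | 30 | 43 | 29 | 29 | 41 | 33 |
| LFM2.5 8B-A1B | Liquid AI | 23 | 38 | 21 | 19 | 38 | 26 |
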
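Car accidents and automobile disputes · Question 4 (not public)

Parking lot accident: using CCTV and dashcam evidence

Proving liability in a parking lot accident

| Model | Vendor | Accuracy | Intent understanding | Caution | Korean context | Structure | avg |
|---|---|---|---|---|---|---|---|
| Gemini 3.1 Pro | Google | 92 | 90 | 90 | 95 | 90 | 92 |
| Claude Opus 4.8 | Anthropic | 86 | 90 | 92 | 88 | 90 | 89 |
| GPT-5.5 | OpenAI | 90 | 88 | 88 | 88 | 92 | 89 |
| Gemini 3.5 Flash | Google | 90 | 88 | 88 | 92 | 90 | 90 |
| DeepSeek V4 Pro | DeepSeek | 90 | 90 | 88 | 90 | 92 | 90 |
| Claude Sonnet 4.6 | Anthropic | 85 | 85 | 82 | 85 | 90 | 85 |
| Qwen 3.7 Max | Alibaba | 85 | 82 | 80 | 85 | 88 | 84 |
| Kimi K2.6 | Moonshot | 78 | 78 | 75 | 75 | 80 | 77 |
| Gemma 4 12B | Google | 80 | 80 | 80 | 80 | 82 | 80 |
| Gemma 4 31B | Google | 80 | 78 | 78 | 80 | 82 | 80 |
| GPT-5.4 Mini | OpenAI | 78 | 78 | 75 | 75 | 78 | 76 |
| Gemma 4 26B A4B | Google | 80 | 78 | 78 | 78 | 82 | 79 |
| MiniMax M3 | MiniMax | 70 | 90 | 88 | 65 | 90 | 78 |
| Gemini 3.1 Flash Lite | Google | 78 | 78 | 78 | 78 | 80 | 78 |
| Mimo V2.5 Pro | Xiaomi | 75 | 75 | 75 | 75 | 82 | 76 |
| Qwen 3.7 Plus | Alibaba | 48 | 78 | 38 | 42 | 82 | 54 |
| GLM 5.1 | Z.ai | 60 | 70 | 55 | 50 | 75 | 60 |
| DeepSeek V4 Flash | DeepSeek | 70 | 72 | 68 | 65 | 75 | 69 |
| Step 3.7 Flash | StepFun | 62 | 88 | 62 | 60 | 88 | 69 |
| Grok 4.3 | xAI | 72 | 72 | 68 | 70 | 75 | 71 |
| Nemotron 3 Ultra 550B | NVIDIA | 42 | 72 | 32 | 40 | 84 | 50 |
| Qwen 3.6 27B | Alibaba | 40 | 75 | 75 | 45 | 85 | 60 |
| Solar Pro 3 | Upstage | 50 | 68 | 52 | 42 | 78 | 55 |
| EXAONE 4.5 33B | LG AI | 45 | 65 | 48 | 42 | 75 | 52 |
| HyperCLOVAX SEED Think 32B | Naver | 40 | 60 | 20 | 60 | 80 | 51 |
| Qwen 3.5 9B | Alibaba | 18 | 55 | 20 | 15 | 75 | 32 |
| Qwen 3.6 35B A3B | Alibaba | 25 | 30 | 20 | 20 | 50 | 27 |
| Mistral Small 4 | Mistral | 45 | 50 | 40 | 35 | 55 | 43 |
| Gemma 4 E2B | Google | 38 | 48 | 36 | 36 | 46 | 40 |
| Kanana 2 30B-A3B Thinking | Kakao | 25 | 40 | 35 | 25 | 45 | 32 |
| HyperCLOVAX SEED 1.5B | Naver | 37 | 47 | 36 | 36 | 45 | 39 |
| LFM2.5 8B-A1B | Liquid AI | 25 | 41 | 23 | 21 | 39 | 28 |
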
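Car accidents and automobile disputes · Question 5 (not public)

Responding to suspected insurance fraud (minor accident, large claim)

Reporting path to the insurer when fraud is suspected

| Model | Vendor | Accuracy | Intent understanding | Caution | Korean context | Structure | avg |
|---|---|---|---|---|---|---|---|
| Gemini 3.1 Pro | Google | 95 | 92 | 92 | 95 | 92 | 94 |
| Claude Opus 4.8 | Anthropic | 90 | 90 | 92 | 92 | 92 | 91 |
| GPT-5.5 | OpenAI | 92 | 90 | 92 | 92 | 92 | 92 |
| Gemini 3.5 Flash | Google | 90 | 88 | 88 | 92 | 92 | 90 |
| DeepSeek V4 Pro | DeepSeek | 92 | 90 | 90 | 92 | 92 | 91 |
| Claude Sonnet 4.6 | Anthropic | 90 | 88 | 88 | 88 | 92 | 89 |
| Qwen 3.7 Max | Alibaba | 92 | 88 | 90 | 90 | 90 | 90 |
| Kimi K2.6 | Moonshot | 82 | 82 | 80 | 82 | 85 | 82 |
| Gemma 4 12B | Google | 78 | 80 | 78 | 72 | 84 | 77 |
| Gemma 4 31B | Google | 78 | 78 | 75 | 75 | 80 | 77 |
| GPT-5.4 Mini | OpenAI | 78 | 78 | 75 | 75 | 78 | 76 |
| Gemma 4 26B A4B | Google | 75 | 75 | 72 | 72 | 80 | 74 |
| MiniMax M3 | MiniMax | 65 | 88 | 88 | 58 | 92 | 75 |
| Gemini 3.1 Flash Lite | Google | 85 | 82 | 82 | 88 | 85 | 85 |
| Mimo V2.5 Pro | Xiaomi | 78 | 78 | 75 | 75 | 82 | 77 |
| Qwen 3.7 Plus | Alibaba | 74 | 84 | 86 | 68 | 88 | 78 |
| GLM 5.1 | Z.ai | 85 | 82 | 82 | 85 | 85 | 84 |
| DeepSeek V4 Flash | DeepSeek | 78 | 78 | 75 | 75 | 80 | 77 |
| Step 3.7 Flash | StepFun | 70 | 90 | 80 | 74 | 90 | 79 |
| Grok 4.3 | xAI | 75 | 75 | 70 | 70 | 78 | 73 |
| Nemotron 3 Ultra 550B | NVIDIA | 70 | 86 | 76 | 64 | 90 | 75 |
| Qwen 3.6 27B | Alibaba | 55 | 85 | 84 | 38 | 90 | 65 |
| Solar Pro 3 | Upstage | 58 | 70 | 56 | 50 | 80 | 60 |
| EXAONE 4.5 33B | LG AI | 55 | 70 | 55 | 48 | 78 | 59 |
| HyperCLOVAX SEED Think 32B | Naver | 60 | 80 | 60 | 60 | 80 | 66 |
| Qwen 3.5 9B | Alibaba | 52 | 78 | 70 | 38 | 80 | 60 |
| Qwen 3.6 35B A3B | Alibaba | 55 | 55 | 50 | 45 | 60 | 52 |
| Mistral Small 4 | Mistral | 50 | 55 | 45 | 40 | 60 | 48 |
| Gemma 4 E2B | Google | 38 | 48 | 36 | 36 | 46 | 40 |
| Kanana 2 30B-A3B Thinking | Kakao | 10 | 15 | 20 | 15 | 10 | 14 |
| HyperCLOVAX SEED 1.5B | Naver | 32 | 44 | 31 | 31 | 42 | 35 |
| LFM2.5 8B-A1B | Liquid AI | 27 | 42 | 26 | 24 | 40 | 30 |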
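
The page does not state how the overall score in the first table is derived from the per-question results. As a rough cross-check only, the sketch below assumes the overall score is close to the plain mean of a model's five per-question avg values. That is an assumption, not the site's documented formula, and the helper name `overall_estimate` is hypothetical. The displayed avg values are whole numbers, so differences of a tenth or two from the listed overall score are to be expected.

```python
from statistics import mean

# Per-question "avg" values for questions 1 to 5, copied from the tables above.
per_question_avg = {
    "Claude Opus 4.8": [89, 91, 89, 89, 91],
    "GPT-5.5": [88, 88, 90, 89, 92],
    "Gemini 3.1 Pro": [91, 89, 91, 92, 94],
}

# Overall scores as listed in the first table.
listed_overall = {
    "Claude Opus 4.8": 89.8,
    "GPT-5.5": 89.4,
    "Gemini 3.1 Pro": 91.3,
}


def overall_estimate(avgs):
    """Assumed aggregation: unweighted mean of per-question avgs, one decimal."""
    return round(mean(avgs), 1)


for model, avgs in per_question_avg.items():
    estimate = overall_estimate(avgs)
    diff = abs(estimate - listed_overall[model])
    print(f"{model}: estimate {estimate}, listed {listed_overall[model]}, diff {diff:.1f}")
```

Under this assumption the estimate reproduces the listed score for Claude Opus 4.8 and GPT-5.5 and is off by 0.1 for Gemini 3.1 Pro, which is consistent with the per-question values being rounded before display.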