EDA365 Electronics Forum

Title: "Unable to handle kernel paging request" error on the HI3515

Author: Divingbear    Time: 2013-2-1 12:39
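I have been designing an industrial PC around the HiSilicon HI3515. After finishing the first revision I started on the second, whose main changes were adding and removing some hardware functions and switching the DDR part from the original HY5PS1G1631CLFP-Y5C to the H5PS1G63EFR-Y5C.
The second revision occasionally fails during boot with an "Unable to handle kernel paging request" error. The address reported after the message changes from one failure to the next, so it is not a fixed, repeatable fault. The boot log from one failure follows: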
Unable to handle kernel NULL pointer dereference at virtual address 0000003c
pgd = c1de4000
[0000003c] *pgd=c1de3031, *pte=00000000, *ppte=00000000
Internal error: Oops: 817 [#1]
Modules linked in:
CPU: 0    Not tainted  (2.6.24-rt1-hi3515v100 #24)
PC is at generic_file_aio_read+0x20/0x1a4
LR is at do_sync_read+0xc8/0x114
pc : [<c0060904>]    lr : [<c007cc1c>]    psr: a0000013
sp : c1dd9d18  ip : 00000000  fp : c1dd9d60
r10: 00000000  r9 : 00000000  r8 : c1dd9e50
r7 : c1dd9d6c  r6 : c1dbb660  r5 : c1dd9dbc  r4 : c1dd9d74
r3 : 00000000  r2 : 00000001  r1 : 0000003c  r0 : c1dd9d74
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 0005317f  Table: c1de4000  DAC: 00000017
Process rcS (pid: 219, stack limit = 0xc1dd8258)
Stack: (0xc1dd9d18 to 0xc1dda000)
9d00:                                                       c0093cc4 c00f95cc
9d20: 00000000 c1dd9e28 c1dd9da4 c1dd9d38 c0085be8 c0093cb4 c1dd9d74 c1dd9dbc
9d40: c1dbb660 c1dd9d6c c1dd9e50 00000000 00000000 c1dd9e24 c1dd9d68 c007cc1c
9d60: c00608f8 00000000 00000000 c1d6bdc0 00000080 00000000 c1dd9fb0 00000000
9d80: 00000001 ffffffff c1dbb660 00000000 00000000 00000000 00000000 c1c5c0e0
9da0: 00000000 00000000 c1dd9de0 c1c5c0e0 c0049d48 c1dd9db4 c1dd9db4 00000000
9dc0: 00000000 c0044454 c0044004 c1dd9df8 c1dbb660 c1dd9e08 00000080 c007bde0
9de0: c0065b7c c1dbb660 c1dd9e28 c1dd9fb0 c1dd8000 c0294ac8 c1dd9fb0 00000080
9e00: c1dbb660 c1d6bdc0 c1dd9e50 c007cb54 00000000 c1dd9fb0 c1dd9e4c c1dd9e28
9e20: c007cd14 c007cb64 c1801b40 bf000000 00000000 c1dd9e50 c1dd8000 c0294ac8
9e40: c1dd9e70 c1dd9e50 c00814b8 c007cc78 00000000 00000000 00000080 c1d6bdc0
9e60: 00000000 c1dd9e8c c1dd9e74 c0081d24 c0081474 c1dd9e98 c1d6bdc0 c1dd9fb0
9e80: c1dd9f30 c1dd9e90 c00ad608 c0081c98 c1d6bdc2 00000000 6e69622f 0068732f
9ea0: c1db2320 beffff7b c1dd9ee8 c1dd9eb8 c006e3e4 00000000 00000020 00000000
9ec0: c1dd8000 00000001 00000000 c1d6bdc0 00000000 00000000 00000000 c1dd9f1c
9ee0: 00000000 00000000 00000000 c1dd9fb0 00000000 c1dd9efc 00000000 00000f7b
9f00: c1dd9f1c c1dd9f10 c0294e54 c1d6bdc0 fffffffe c1dd8000 c0294e34 c1d6bdc0
9f20: fffffff8 c1dd9f5c c1dd9f34 c0081fc0 c00ad3fc c1d6bdc0 beffff92 00000000
9f40: 001d7bcc 001d7bc4 c1dd8000 c1dd9fb0 c1dd9f84 c1dd9f60 c0082248 c0081f2c
9f60: c1c99000 001d7bcc c1dd9fb0 c1c99000 c001ffe4 001c71fc c1dd9fa4 c1dd9f88
9f80: c00237c4 c0082124 001d7bc4 001d7ba4 00000001 0000000b 00000000 c1dd9fa8
9fa0: c001fe40 c0023798 001d7bc4 001d7ba4 001d7ba4 001d7bc4 001d7bcc 001d7ba4
9fc0: 001d7bc4 001d7ba4 00000001 001d7bcc 001d7bcc 00000001 001c71fc 00000000
9fe0: 400833c4 be813a44 00069d44 400833cc 20000010 001d7ba4 00000000 00000000
Backtrace:
[<c00608e8>] (generic_file_aio_read+0x4/0x1a4) from [<c007cc1c>] (do_sync_read+0xc8/0x114)
[<c007cb54>] (do_sync_read+0x0/0x114) from [<c007cd14>] (vfs_read+0xac/0x144)
[<c007cc68>] (vfs_read+0x0/0x144) from [<c00814b8>] (kernel_read+0x54/0x84)
  r8:c0294ac8 r7:c1dd8000 r6:c1dd9e50 r5:00000000 r4:bf000000
[<c0081464>] (kernel_read+0x0/0x84) from [<c0081d24>] (prepare_binprm+0x9c/0x108)
  r6:00000000 r5:c1d6bdc0 r4:00000080
[<c0081c88>] (prepare_binprm+0x0/0x108) from [<c00ad608>] (load_script+0x21c/0x240)
  r6:c1dd9fb0 r5:c1d6bdc0 r4:c1dd9e98
[<c00ad3ec>] (load_script+0x0/0x240) from [<c0081fc0>] (search_binary_handler+0xa4/0x1f8)
  r6:fffffff8 r5:c1d6bdc0 r4:c0294e34
[<c0081f1c>] (search_binary_handler+0x0/0x1f8) from [<c0082248>] (do_execve+0x134/0x184)
[<c0082114>] (do_execve+0x0/0x184) from [<c00237c4>] (sys_execve+0x3c/0x5c)
[<c0023788>] (sys_execve+0x0/0x5c) from [<c001fe40>] (ret_fast_syscall+0x0/0x2c)
  r7:0000000b r6:00000001 r5:001d7ba4 r4:001d7bc4
Code: e24dd020 e3a0c000 e58b3004 e1a04000 (e1a0a001)
---[ end trace e292f16b7bf51848 ]---
Segmentation fault
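As a reading aid for the dump above, here is a minimal Python sketch that decodes two raw values from it: the `psr` word, and the two stack words `6e69622f 0068732f` on the `9e80:` line. The helper names `decode_psr` and `stack_words_to_text` are hypothetical, and the string decoding assumes the board runs little-endian. The bit positions are the standard ARM status register layout (N/Z/C/V in bits 31..28, I/F/T in bits 7..5, mode in bits 4..0).

```python
# Hypothetical helpers for reading values out of the oops log above.
# Only the most common ARM processor modes are listed.
ARM_MODES = {
    0x10: "USER_32", 0x11: "FIQ_32", 0x12: "IRQ_32", 0x13: "SVC_32",
    0x17: "ABT_32", 0x1B: "UND_32", 0x1F: "SYS_32",
}


def decode_psr(psr: int) -> dict:
    """Split an ARM program status register value into readable fields."""
    # Upper case means the condition flag is set, lower case means clear,
    # matching the "Flags: NzCv" convention used by the kernel.
    flags = "".join(
        name.upper() if psr & (1 << bit) else name.lower()
        for name, bit in (("n", 31), ("z", 30), ("c", 29), ("v", 28))
    )
    return {
        "flags": flags,
        "irqs": "off" if psr & (1 << 7) else "on",   # I bit masks IRQs
        "fiqs": "off" if psr & (1 << 6) else "on",   # F bit masks FIQs
        "isa": "Thumb" if psr & (1 << 5) else "ARM",  # T bit
        "mode": ARM_MODES.get(psr & 0x1F, "unknown"),
    }


def stack_words_to_text(words) -> str:
    """Interpret 32-bit stack words as a NUL-terminated ASCII string.

    Assumes little-endian byte order (an assumption, not stated in the log).
    """
    raw = b"".join(w.to_bytes(4, "little") for w in words)
    return raw.split(b"\x00", 1)[0].decode("ascii")


if __name__ == "__main__":
    print(decode_psr(0xA0000013))
    print(stack_words_to_text([0x6E69622F, 0x0068732F]))
```

The decoded `psr` fields agree with the `Flags:` line the kernel printed, and the two stack words spell the interpreter path `/bin/sh`, which fits the `load_script` frame in the backtrace: the kernel was loading the interpreter for the `rcS` script when the fault occurred.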
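I first swapped the NAND flash chips between the first-revision and second-revision boards. Neither chip produced errors on the first-revision board, while both produced errors on the second-revision board, so the problem appears to follow the board rather than the chip.
Next I reverted the hardware changes on the second revision so that it matched the first as closely as possible. The error still appeared at boot, which indicates that the peripheral hardware changes are not what causes the boot failure.
I am now sending the new board back to the factory to have the DDR chips replaced with the same part used on the first revision, and will test again. This will take some time.
I also compared the datasheets of the two DDR parts and found no obvious parameter differences.

PS:
I have spent the past few days focused on this problem. Here is an update: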
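1. Booting from U-Boot and downloading the subsequent images into memory over the network still reproduces the problem, which rules out the NAND flash.
2. Captured the 1.8V and VREF_DDR voltage waveforms and found no droop or ripple, which rules out the supply voltage.
3. Swapped the HY5PS1G1631CLFP-Y5C from the old board with the H5PS1G63EFR-Y5C from the new board. The old board has not failed once so far, while the new board has failed once.
4. Bought a few HY5PS1G1631CFP-Y5C parts and fitted them to the board. (Compared with the DDR on the original board, the part number is missing an "L". The original part has been discontinued, and this one has a slightly higher IDD6 current with otherwise identical parameters.) No errors so far, but it needs more run time.
5. Based on the experiments above, this looks like a layout problem. I checked the routing on the old and new revisions and it is exactly the same: the etch length and Manhattan length are identical.
6. Next I plan to compare the stack-up thickness, because the two PCB revisions were laminated by different fabs, and each fab calculated the stack-up thickness itself from the target impedance I provided.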
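To illustrate why item 6 is worth checking, here is a rough sketch of how dielectric thickness alone can shift trace impedance when the routing is unchanged. It uses the widely quoted IPC-2141 closed-form approximation for a surface microstrip. All dimensions and the dielectric constant below are made-up illustration values, not measurements from these boards, and it assumes the DDR traces are outer-layer microstrips (inner-layer striplines follow a different formula).

```python
import math


def microstrip_z0(h_mm: float, w_mm: float, t_mm: float, er: float) -> float:
    """Approximate surface microstrip impedance in ohms (IPC-2141 formula).

    h_mm: dielectric thickness between the trace and its reference plane
    w_mm: trace width
    t_mm: copper thickness
    er:   relative dielectric constant of the laminate
    The approximation is usually quoted as valid for roughly 0.1 < w/h < 2.0.
    """
    return 87.0 / math.sqrt(er + 1.41) * math.log(5.98 * h_mm / (0.8 * w_mm + t_mm))


if __name__ == "__main__":
    # Hypothetical values: identical trace geometry, two different stack-ups.
    WIDTH_MM, COPPER_MM, ER = 0.15, 0.035, 4.3
    for h in (0.10, 0.12):
        z0 = microstrip_z0(h, WIDTH_MM, COPPER_MM, ER)
        print(f"h = {h:.2f} mm -> Z0 ~ {z0:.1f} ohm")
```

With these example numbers, a 0.02 mm difference in dielectric thickness moves the impedance by several ohms even though width and length are the same on both boards, so comparing the actual stack-up reports from the two fabs is a reasonable next step.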
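Author: wenqing89    Time: 2015-7-29 16:28
Impressive work. But the DDR2 that the 3515 supports tops out at around 200MHz, doesn't it? Are the routing requirements really that demanding?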



