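Introduction
Multimedia processing, such as image and video processing, is one of the most important application areas for FPGAs. Edge detection is a fundamental problem in image processing and computer vision, and therefore one of the most frequently used operations. As data volumes grow and real-time requirements tighten, software alone often cannot meet practical needs, and dedicated hardware is required for acceleration. This section implements a simple Sobel edge-detection accelerator. To allow a comparison, we also write the corresponding software algorithm.

1. Basic idea and algorithm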
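The Sobel method estimates the derivative of each pixel in each direction through a process called convolution: the center pixel and its eight nearest neighbors are each multiplied by a coefficient, and the products are summed. The coefficients are usually written as a 3×3 convolution mask. The Sobel masks Gx and Gy, used to compute the derivatives in the x and y directions, are shown below.

Gx:
$$ G_x = \begin{pmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{pmatrix} $$

Gy:
$$ G_y = \begin{pmatrix} +1 & +2 & +1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{pmatrix} $$
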
We multiply each pixel value by the corresponding coefficient in the convolution mask and add the nine products to obtain the partial derivatives Dx and Dy in the x and y directions. These two partial derivatives are then used to compute the derivative magnitude at the center pixel:

$$ |D| = \sqrt{D_x^2 + D_y^2} $$

Because we only want to locate the maxima and minima of the derivative magnitude, the expression can be simplified as follows:

$$ |D| \approx |D_x| + |D_y| $$

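This approximation is adequate for our purpose: the square and square-root functions are both monotonic, so the maxima and minima of the approximated magnitude occur at essentially the same places in the image as those of the exact magnitude. In addition, computing absolute values takes far fewer hardware resources than computing squares and a square root.
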
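As a small illustration of how close the approximation stays to the exact magnitude, the following C sketch compares the two without needing a square root. The function names are hypothetical and not part of the original design. It relies on the identity |D|² ≤ (|Dx| + |Dy|)² ≤ 2·|D|², which means the approximation overestimates the exact magnitude by at most a factor of √2.

```c
#include <stdlib.h>

/* Squared exact magnitude Dx^2 + Dy^2 (avoids a square root for comparisons). */
long sobel_mag_sq(int dx, int dy)
{
    return (long)dx * dx + (long)dy * dy;
}

/* Approximation used in the text: |Dx| + |Dy|. */
int sobel_mag_approx(int dx, int dy)
{
    return abs(dx) + abs(dy);
}

/* Returns 1 if |D|^2 <= (|Dx|+|Dy|)^2 <= 2*|D|^2 holds for this input pair. */
int sobel_approx_within_bounds(int dx, int dy)
{
    long a = sobel_mag_approx(dx, dy);
    long m = sobel_mag_sq(dx, dy);
    return m <= a * a && a * a <= 2 * m;
}
```

For example, Dx = 3 and Dy = -4 give an exact magnitude of 5 and an approximate magnitude of 7.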
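The derivative magnitude has to be computed for every pixel position in the image. Note, however, that the pixels along the image border do not have a complete set of neighbors from which to compute the partial derivatives, so they must be handled separately. The simplest approach is to set the derivative magnitude |D| of the border pixels to 0, which can be done in software.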
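We express the algorithm in pseudocode. Let O[row][col] denote the pixels of the original image and D[row][col] the pixels of the derivative image, where row ranges from 0 to height-1 and col from 0 to width-1. Let Gx[i][j] and Gy[i][j] denote the convolution masks, where i and j range from -1 to 1. The loops skip the border rows and columns, which have no complete neighborhood.
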
for (row = 1; row <= height-2; row = row+1)
{
    for (col = 1; col <= width-2; col = col+1)
    {
        sumx = 0; sumy = 0;
        for (i = -1; i <= +1; i = i+1)
        {
            for (j = -1; j <= +1; j = j+1)
            {
                sumx = sumx + O[row+i][col+j] * Gx[i][j];
                sumy = sumy + O[row+i][col+j] * Gy[i][j];
            }
        }
        D[row][col] = abs(sumx) + abs(sumy);
    }
}

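The pseudocode can be turned into a plain C reference implementation along the following lines. This is only a sketch under stated assumptions: the function name `sobel_sw`, the row-major 8-bit grayscale buffer layout, and the clamping of the result to 255 are choices made here for illustration, not taken from the original software. Border pixels are set to 0, as described in the text.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Sobel masks, indexed as [i+1][j+1] for i, j in -1..1. */
static const int GX[3][3] = { {-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1} };
static const int GY[3][3] = { { 1, 2, 1}, { 0, 0, 0}, {-1, -2, -1} };

/* src and dst are row-major 8-bit grayscale images of width*height pixels
 * (layout assumed for this sketch). */
void sobel_sw(const uint8_t *src, uint8_t *dst, int width, int height)
{
    /* Border pixels have no complete neighborhood: leave them at 0. */
    memset(dst, 0, (size_t)width * (size_t)height);

    for (int row = 1; row < height - 1; row++) {
        for (int col = 1; col < width - 1; col++) {
            int sumx = 0, sumy = 0;
            for (int i = -1; i <= 1; i++) {
                for (int j = -1; j <= 1; j++) {
                    int p = src[(row + i) * width + (col + j)];
                    sumx += p * GX[i + 1][j + 1];
                    sumy += p * GY[i + 1][j + 1];
                }
            }
            int mag = abs(sumx) + abs(sumy);
            /* Clamp to 8 bits (an assumption of this sketch). */
            dst[row * width + col] = (uint8_t)(mag > 255 ? 255 : mag);
        }
    }
}
```

Calling `sobel_sw(src, dst, 1024, 768)` on a 1024×768 grayscale buffer would produce the derivative image in `dst`, which can serve as a baseline when comparing against the hardware accelerator.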
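2. Hardware implementation
Once the Sobel algorithm is understood, the design can be partitioned into modules. In this example the accelerator consists of four submodules (addr_gen, compute, machine and sobel_slave) and one top-level module, sobel.

Note that the Sobel accelerator is ultimately attached to the SoC bus, so it needs a standard bus interface. This example uses the Wishbone bus. For background on Wishbone, see: http://blog.csdn.net/rill_zhen/article/details/8659788

In addition, the accelerator only speeds up the core data processing. A complete image file (for example a BMP) also carries a file header, so the header has to be stripped before processing and added back afterwards before the result can be opened in an image viewer. This requires a corresponding software program.
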
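The header handling mentioned above could be sketched in C as follows. The original software is not shown in this post, so the function names here are hypothetical. The sketch relies only on the BMP file header layout: the file starts with the bytes 'B' and 'M', and the 32-bit little-endian offset of the pixel data is stored at byte 10.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the byte offset of the pixel data inside a BMP file image,
 * or 0 if the buffer does not start with a BMP file header. */
uint32_t bmp_pixel_offset(const uint8_t *file, size_t len)
{
    if (len < 14 || file[0] != 'B' || file[1] != 'M')
        return 0;
    return read_le32(file + 10);
}
```

Software would then set the first `bmp_pixel_offset(...)` bytes aside as the header (for an 8-bit image this region also contains the palette), hand the remaining pixel bytes to the accelerator, and write the saved header back in front of the processed pixels.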
Only the Verilog code listings are given below:

sobel.v

`timescale 10ns/10ns
module sobel (
    //master slave share i/f
    clk_i,
    rst_i,
    dat_i,
    dat_o,
    //master i/f
    cyc_o,
    stb_o,
    we_o,
    adr_o,
    ack_i,
    //slave i/f
    cyc_i,
    stb_i,
    we_i,
    adr_i,
    ack_o,
    //debug i/f
    prev_row_load,
    curr_row_load,
    next_row_load,
    current_state,
    int_req //interrupt
);

input         clk_i;
input         rst_i;
input  [31:0] dat_i;
output [31:0] dat_o;
output        cyc_o;
output        stb_o;
output        we_o;
output [21:0] adr_o;
input         ack_i;
input         cyc_i;
input         stb_i;
input         we_i;
input  [21:0] adr_i;
output        ack_o;
output        prev_row_load;
output        curr_row_load;
output        next_row_load;
output [4:0]  current_state;
output        int_req;

wire        start;
wire        ack_i;
wire        cyc_o;
wire        stb_o;
wire        we_o;
wire        O_base_ce;
wire        D_base_ce;
wire        done_set;
wire        shift_en;
wire        prev_row_load;
wire        curr_row_load;
wire        next_row_load;
wire        O_offset_cnt_en;
wire        D_offset_cnt_en;
wire        offset_reset;
wire [31:0] result_row;
wire [31:0] dat_i;
wire [21:0] adr_o;
wire        ack_o;
wire [31:0] dat_o;
wire [4:0]  current_state;

//module instances
compute compute(
    .rst_i(rst_i),
    .clk_i(clk_i),
    .dat_i(dat_i),
    .shift_en(shift_en),
    .prev_row_load(prev_row_load),
    .curr_row_load(curr_row_load),
    .next_row_load(next_row_load),
    .result_row(result_row)
);

addr_gen addr_gen(
    .clk_i(clk_i),
    .dat_i(dat_i),
    .O_base_ce(O_base_ce),
    .D_base_ce(D_base_ce),
    .O_offset_cnt_en(O_offset_cnt_en),
    .D_offset_cnt_en(D_offset_cnt_en),
    .offset_reset(offset_reset),
    .prev_row_load(prev_row_load),
    .curr_row_load(curr_row_load),
    .next_row_load(next_row_load),
    .adr_o(adr_o)
);

machine machine(
    .clk_i(clk_i),
    .rst_i(rst_i),
    .ack_i(ack_i),
    .start(start),
    .offset_reset(offset_reset),
    .O_offset_cnt_en(O_offset_cnt_en),
    .D_offset_cnt_en(D_offset_cnt_en),
    .prev_row_load(prev_row_load),
    .curr_row_load(curr_row_load),
    .next_row_load(next_row_load),
    .shift_en(shift_en),
    .cyc_o(cyc_o),
    .we_o(we_o),
    .stb_o(stb_o),
    .current_state(current_state),
    .done_set(done_set)
);

sobel_slave sobel_slave(
    .clk_i(clk_i),
    .rst_i(rst_i),
    .dat_i(dat_i),
    .dat_o(dat_o),
    .cyc_i(cyc_i),
    .stb_i(stb_i),
    .we_i(we_i),
    .adr_i(adr_i),
    .ack_o(ack_o),
    .start(start),
    .O_base_ce(O_base_ce),
    .D_base_ce(D_base_ce),
    .int_req(int_req),
    .done_set(done_set),
    .result_row(result_row)
);

endmodule

addr_gen.v

`timescale 10ns/10ns
module addr_gen(
    //input
    clk_i,
    dat_i,
    O_base_ce,
    D_base_ce,
    O_offset_cnt_en,
    D_offset_cnt_en,
    offset_reset,
    prev_row_load,
    curr_row_load,
    next_row_load,
    //output
    adr_o
);

input         clk_i;
input  [31:0] dat_i;
input         O_base_ce;
input         D_base_ce;
input         O_offset_cnt_en;
input         D_offset_cnt_en;
input         offset_reset;
input         prev_row_load;
input         curr_row_load;
input         next_row_load;

output [21:0] adr_o;

parameter WIDTH = 1024; //pixels per row

reg  [19:0] O_base;
reg  [19:0] D_base;
reg  [18:0] O_offset;
reg  [18:0] D_offset;
wire [19:0] O_prev_addr;
wire [19:0] O_curr_addr;
wire [19:0] O_next_addr;
wire [19:0] D_addr;

/*******************************************************/
always @(posedge clk_i) //original image base address (word address)
    if (O_base_ce)
        O_base <= dat_i[21:2];

always @(posedge clk_i) //original image offset counter
    if (offset_reset)
        O_offset <= 0;
    else if (O_offset_cnt_en)
        O_offset <= O_offset + 1;

/*******************************************************/
//one row is WIDTH/4 words, because each 32-bit word holds four pixels
assign O_prev_addr = O_base + O_offset;
assign O_curr_addr = O_prev_addr + WIDTH/4;
assign O_next_addr = O_prev_addr + 2*WIDTH/4;
/*******************************************************/
always @(posedge clk_i) //destination image base address (word address)
    if (D_base_ce)
        D_base <= dat_i[21:2];

always @(posedge clk_i) //destination image offset counter
    if (offset_reset)
        D_offset <= 0;
    else if (D_offset_cnt_en)
        D_offset <= D_offset + 1;

/*******************************************************/
assign D_addr = D_base + D_offset;
/*******************************************************/
assign adr_o[21:2] = prev_row_load ? O_prev_addr :
                     curr_row_load ? O_curr_addr :
                     next_row_load ? O_next_addr :
                                     D_addr;

assign adr_o[1:0] = 2'b00; //word-aligned byte address
endmodule

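To make the address arithmetic of addr_gen easier to check in software, here is a small C model of the read address selection. It is a sketch, not part of the original design: the names `sobel_read_addr` and `Row` are hypothetical, and the masks simply mirror the 20-bit and 19-bit register widths declared in the module.

```c
#include <stdint.h>

enum { WIDTH = 1024 };  /* pixels per row, same value as the Verilog parameter */

/* Which of the three image rows is being fetched. */
typedef enum { ROW_PREV = 0, ROW_CURR = 1, ROW_NEXT = 2 } Row;

/* base_byte_addr: value written to the O_base register (dat_i).
 * offset: current value of the O_offset word counter.
 * Returns the 22-bit byte address driven on adr_o. */
uint32_t sobel_read_addr(uint32_t base_byte_addr, uint32_t offset, Row row)
{
    uint32_t base_word = (base_byte_addr >> 2) & 0xFFFFFu; /* dat_i[21:2] */
    uint32_t word = base_word + (offset & 0x7FFFFu);       /* O_prev_addr */
    word += (uint32_t)row * (WIDTH / 4);                   /* +0, +1 or +2 rows */
    return (word & 0xFFFFFu) << 2;                         /* adr_o[1:0] = 00 */
}
```

With a base address of 0x1000 and an offset of 0, the three rows are fetched from 0x1000, 0x1400 and 0x1800, that is, 1024 bytes (one image row) apart.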
compute.v

`timescale 1000ns/10ns

module compute(
    //input
    clk_i,
    dat_i,
    rst_i,
    shift_en,
    prev_row_load,
    curr_row_load,
    next_row_load,
    //output
    result_row
);

input         clk_i;
input         rst_i;
input  [31:0] dat_i;
input         shift_en;
input         prev_row_load;
input         curr_row_load;
input         next_row_load;
wire   [7:0]  D1;
output [31:0] result_row;

reg [31:0] prev_row=0, curr_row=0, next_row=0;
reg [7:0]  O[-1:1][-1:1];

reg signed [10:0] Dx=0, Dy=0;

reg [7:0]  abs_D = 0;
reg [31:0] result_row = 0;

always @(posedge clk_i)
    if (prev_row_load)
        prev_row <= dat_i;
    else if (shift_en)
        prev_row[31:8] <= prev_row[23:0];

always @(posedge clk_i)
    if (curr_row_load)
        curr_row <= dat_i;
    else if (shift_en)
        curr_row[31:8] <= curr_row[23:0];

always @(posedge clk_i)
    if (next_row_load)
        next_row <= dat_i;
    else if (shift_en)
        next_row[31:8] <= next_row[23:0];

function [10:0] abs (input signed [10:0] x);
    abs = x >= 0 ? x : -x;
endfunction

//compute pipeline
always @(posedge clk_i)
    if (rst_i)
    begin
        O[-1][-1] <= 0;
        O[-1][ 0] <= 0;
        O[-1][+1] <= 0;
        O[ 0][-1] <= 0;
        O[ 0][ 0] <= 0;
        O[ 0][+1] <= 0;
        O[+1][-1] <= 0;
        O[+1][ 0] <= 0;
        O[+1][+1] <= 0;
    end
    else if (shift_en)
    begin
        abs_D <= (abs(Dx) + abs(Dy)) >> 3;

        Dx <= -$signed({3'b000, O[-1][-1]})         //-1* O[-1][-1]
              +$signed({3'b000, O[-1][+1]})         //+1* O[-1][+1]
              -($signed({3'b000, O[ 0][-1]}) << 1)  //-2* O[ 0][-1]
              +($signed({3'b000, O[ 0][+1]}) << 1)  //+2* O[ 0][+1]
              -$signed({3'b000, O[+1][-1]})         //-1* O[+1][-1]
              +$signed({3'b000, O[+1][+1]});        //+1* O[+1][+1]

        Dy <=  $signed({3'b000, O[-1][-1]})         //+1* O[-1][-1]
              +($signed({3'b000, O[-1][ 0]}) << 1)  //+2* O[-1][ 0]
              +$signed({3'b000, O[-1][+1]})         //+1* O[-1][+1]
              -$signed({3'b000, O[+1][-1]})         //-1* O[+1][-1]
              -($signed({3'b000, O[+1][ 0]}) << 1)  //-2* O[+1][ 0]
              -$signed({3'b000, O[+1][+1]});        //-1* O[+1][+1]

        O[-1][-1] <= O[-1][0];
        O[-1][ 0] <= O[-1][+1];
        O[-1][+1] <= prev_row[31:24];
        O[ 0][-1] <= O[0][0];
        O[ 0][ 0] <= O[0][+1];
        O[ 0][+1] <= curr_row[31:24];
        O[+1][-1] <= O[+1][0];
        O[+1][ 0] <= O[+1][+1];
        O[+1][+1] <= next_row[31:24];
    end
/*********************** Thresholding ***********************/
//assign D1 = abs_D < 60 ? 0 : abs_D; //optional: enable when needed for some images; the threshold of 60 can be changed
/*****************************************************/
always @(posedge clk_i)
    if (shift_en)
        result_row <= { result_row[23:0], /*D1*/ abs_D };
endmodule

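For checking the hardware output against software, the arithmetic of the compute module for a single 3×3 window can be modeled in C roughly as follows. This is a sketch under assumptions: the function name `sobel_pixel` is hypothetical, and the model covers only the arithmetic (mask coefficients, absolute values, and the right shift by 3 that scales the sum into 8 bits), not the pipeline timing.

```c
#include <stdint.h>
#include <stdlib.h>

/* o[r][c] holds the 3x3 window; o[0] is the previous row, o[2] the next row,
 * matching O[-1..1][-1..1] in compute.v shifted to indices 0..2. */
uint8_t sobel_pixel(const uint8_t o[3][3])
{
    int dx = -o[0][0] + o[0][2]
             - 2 * o[1][0] + 2 * o[1][2]
             - o[2][0] + o[2][2];
    int dy =  o[0][0] + 2 * o[0][1] + o[0][2]
             - o[2][0] - 2 * o[2][1] - o[2][2];
    /* |Dx| + |Dy| is at most 2040, so after >> 3 the result fits in 8 bits. */
    return (uint8_t)((abs(dx) + abs(dy)) >> 3);
}
```

A window whose left column is 0 and whose right column is 255 gives Dx = 1020 and Dy = 0, so the model returns 1020 >> 3 = 127. A uniform window returns 0.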
machine.v

`timescale 10ns/10ns
/*****************FSM***********************/
module machine(
    //input
    clk_i,
    rst_i,
    ack_i,
    start,
    //output
    offset_reset,
    O_offset_cnt_en,
    D_offset_cnt_en,
    prev_row_load,
    curr_row_load,
    next_row_load,
    shift_en,
    cyc_o,
    we_o,
    stb_o,
    current_state,
    done_set
);

input        clk_i;
input        rst_i;
input        ack_i;
input        start;
output       offset_reset;
output       O_offset_cnt_en;
output       D_offset_cnt_en;
output       prev_row_load;
output       curr_row_load;
output       next_row_load;
output       shift_en;
output       cyc_o;
output       we_o;
output       stb_o;
output [4:0] current_state;
output       done_set;

parameter WIDTH  = 1024;
parameter HEIGHT = 768;

//the _158/_159 suffixes appear to date from a 640-pixel-wide image
//(160 words per row); these states handle the last two words of a row
parameter [4:0] idle         = 5'b00000,
                read_prev_0  = 5'b00001,
                read_curr_0  = 5'b00010,
                read_next_0  = 5'b00011,
                comp1_0      = 5'b00100,
                comp2_0      = 5'b00101,
                comp3_0      = 5'b00110,
                comp4_0      = 5'b00111,
                read_prev    = 5'b01000,
                read_curr    = 5'b01001,
                read_next    = 5'b01010,
                comp1        = 5'b01011,
                comp2        = 5'b01100,
                comp3        = 5'b01101,
                comp4        = 5'b01110,
                write_result = 5'b01111,
                write_158    = 5'b10000,
                comp1_159    = 5'b10001,
                comp2_159    = 5'b10010,
                comp3_159    = 5'b10011,
                comp4_159    = 5'b10100,
                write_159    = 5'b10101;

reg [4:0]  current_state, next_state;
reg [10:0] row; //0 ~ HEIGHT-3
reg [8:0]  col; //0 ~ WIDTH/4-1

wire start;
wire stb_o;
reg  offset_reset, row_reset, col_reset;

reg  prev_row_load, curr_row_load, next_row_load;
reg  shift_en;
reg  cyc_o, we_o;
reg  row_cnt_en, col_cnt_en;
reg  O_offset_cnt_en, D_offset_cnt_en;
reg  int_en, done_set, done;


always @(posedge clk_i) //Row counter
    if (row_reset)
        row <= 0;
    else if (row_cnt_en)
        row <= row + 1;

always @(posedge clk_i) //Column counter
    if (col_reset)
        col <= 0;
    else if (col_cnt_en)
        col <= col + 1;

//fsm
always @(posedge clk_i) //State register
    if (rst_i)
        current_state <= idle;
    else
        current_state <= next_state;

always @*
begin
    offset_reset    = 1'b0;
    row_reset       = 1'b0;
    col_reset       = 1'b0;
    row_cnt_en      = 1'b0;
    col_cnt_en      = 1'b0;
    O_offset_cnt_en = 1'b0;
    D_offset_cnt_en = 1'b0;
    prev_row_load   = 1'b0;
    curr_row_load   = 1'b0;
    next_row_load   = 1'b0;
    shift_en        = 1'b0;
    cyc_o           = 1'b0;
    we_o            = 1'b0;
    done_set        = 1'b0;

    case (current_state)
    idle: begin
        cyc_o           = 1'b0;
        we_o            = 1'b0;
        done_set        = 1'b0;
        D_offset_cnt_en = 1'b0;
        offset_reset    = 1'b1;
        row_reset       = 1'b1;
        col_reset       = 1'b1;
        if (start)
            next_state = read_prev_0;
        else
            next_state = idle;
    end
/*************************************************************/
    read_prev_0: begin
        offset_reset    = 1'b0;
        row_reset       = 1'b0;
        we_o            = 1'b0;
        row_cnt_en      = 1'b0;
        D_offset_cnt_en = 1'b0;
        col_reset       = 1'b1;
        prev_row_load   = 1'b1;
        cyc_o           = 1'b1;
        if (ack_i)
            next_state = read_curr_0;
        else
            next_state = read_prev_0;
    end
/***********************************************************************/
    read_curr_0: begin
        col_reset     = 1'b0;
        prev_row_load = 1'b0;
        curr_row_load = 1'b1;
        cyc_o         = 1'b1;
        if (ack_i) //read mem
            next_state = read_next_0;
        else
            next_state = read_curr_0;
    end
/*********************************************************************/
    read_next_0: begin
        curr_row_load = 1'b0;
        next_row_load = 1'b1;
        cyc_o         = 1'b1;
        if (ack_i) //read mem, cnt+=1
        begin
            O_offset_cnt_en = 1'b1;
            next_state      = comp1_0;
        end
        else
            next_state = read_next_0;
    end
/********************************************************************/
    comp1_0: begin
        next_row_load   = 1'b0;
        cyc_o           = 1'b0;
        O_offset_cnt_en = 1'b0;
        shift_en        = 1'b1;
        next_state      = comp2_0;
    end
    comp2_0: begin
        shift_en   = 1'b1;
        next_state = comp3_0;
    end
    comp3_0: begin
        shift_en   = 1'b1;
        next_state = comp4_0;
    end
    comp4_0: begin
        shift_en   = 1'b1;
        next_state = read_prev;
    end
/**************************************************************/
    read_prev: begin
        shift_en        = 1'b0;
        we_o            = 1'b0;
        col_cnt_en      = 1'b0;
        D_offset_cnt_en = 1'b0;
        prev_row_load   = 1'b1;
        cyc_o           = 1'b1;
        if (ack_i)
            next_state = read_curr;
        else
            next_state = read_prev;
    end
    read_curr: begin
        prev_row_load = 1'b0;
        curr_row_load = 1'b1;
        cyc_o         = 1'b1;
        if (ack_i)
            next_state = read_next;
        else
            next_state = read_curr;
    end
    read_next: begin
        curr_row_load = 1'b0;
        next_row_load = 1'b1;
        cyc_o         = 1'b1;
        if (ack_i)
        begin
            O_offset_cnt_en = 1'b1;
            next_state      = comp1;
        end
        else
            next_state = read_next;
    end
/************************************************************/
    comp1: begin
        next_row_load   = 1'b0;
        O_offset_cnt_en = 1'b0;
        cyc_o           = 1'b0;
        shift_en        = 1'b1;
        next_state      = comp2;
    end
    comp2: begin
        shift_en   = 1'b1;
        next_state = comp3;
    end
    comp3: begin
        shift_en   = 1'b1;
        next_state = comp4;
    end
    comp4: begin
        shift_en = 1'b1;
        if (col == (WIDTH/4-2))
            next_state = write_158;
        else
            next_state = write_result;
    end
/********************************************************/
    write_result: begin
        shift_en = 1'b0;
        cyc_o    = 1'b1;
        we_o     = 1'b1;
        if (ack_i)
        begin
            col_cnt_en      = 1'b1;
            D_offset_cnt_en = 1'b1;
            next_state      = read_prev;
        end
        else
            next_state = write_result;
    end
    write_158: begin
        shift_en = 1'b0;
        cyc_o    = 1'b1;
        we_o     = 1'b1;
        if (ack_i)
        begin
            col_cnt_en      = 1'b1;
            D_offset_cnt_en = 1'b1;
            next_state      = comp1_159;
        end
        else
            next_state = write_158;
    end
/***************************************************************/
    comp1_159: begin //pipeline output stage
        col_cnt_en      = 1'b0;
        D_offset_cnt_en = 1'b0;
        cyc_o           = 1'b0;
        we_o            = 1'b0;
        shift_en        = 1'b1;
        next_state      = comp2_159;
    end
    comp2_159: begin
        shift_en   = 1'b1;
        next_state = comp3_159;
    end
    comp3_159: begin
        shift_en   = 1'b1;
        next_state = comp4_159;
    end
    comp4_159: begin
        shift_en   = 1'b1;
        next_state = write_159;
    end
    write_159: begin
        shift_en = 1'b0;
        cyc_o    = 1'b1;
        we_o     = 1'b1;
        if (ack_i)
        begin
            D_offset_cnt_en = 1'b1;
            if (row == HEIGHT-3) //sobel done
            begin
                done_set   = 1'b1;
                next_state = idle;
            end
            else
            begin
                row_cnt_en = 1'b1;
                next_state = read_prev_0;
            end
        end
        else
            next_state = write_159;
    end
    //unused state encodings: return to idle so next_state is always
    //assigned and no latch is inferred
    default: next_state = idle;
    endcase
end
/*******************************************************************/

assign stb_o = cyc_o;

endmodule

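To get a feel for the bus traffic and run time implied by this state machine, the state sequence for one image row can be walked through in C. The sketch below is illustrative only: the names `RowCost`, `sobel_row_cost` and `sobel_min_cycles` are hypothetical, and the cycle estimate assumes that every Wishbone access is acknowledged in a single clock cycle, which gives a lower bound rather than a measured figure. It also assumes the width is a multiple of 4 and at least 8.

```c
/* Number of FSM states visited per image row, grouped by kind. */
typedef struct {
    unsigned reads;   /* read_prev*, read_curr*, read_next* states */
    unsigned writes;  /* write_result, write_158, write_159 states */
    unsigned shifts;  /* comp* states, one pixel shifted per cycle */
} RowCost;

RowCost sobel_row_cost(unsigned width)
{
    RowCost c = {0, 0, 0};
    unsigned words = width / 4;  /* 32-bit words per row, 4 pixels each */
    unsigned col = 0;

    /* read_prev_0 .. comp4_0: load the first word of the three rows */
    c.reads += 3;
    c.shifts += 4;

    for (;;) {
        /* read_prev, read_curr, read_next, comp1 .. comp4 */
        c.reads += 3;
        c.shifts += 4;
        /* write_result, or write_158 when col == WIDTH/4-2 */
        c.writes += 1;
        if (col == words - 2)
            break;
        col++;
    }

    /* comp1_159 .. comp4_159, write_159: flush the last result word */
    c.shifts += 4;
    c.writes += 1;
    return c;
}

/* Lower bound on clock cycles for a whole image (single-cycle ack assumed).
 * The FSM processes rows 0 .. HEIGHT-3, that is, height-2 rows. */
unsigned long sobel_min_cycles(unsigned width, unsigned height)
{
    RowCost c = sobel_row_cost(width);
    return (unsigned long)(height - 2) * (c.reads + c.writes + c.shifts);
}
```

For the 1024×768 configuration used in the listing, this model gives 768 reads, 256 writes and 1028 shift cycles per row, so each result word costs three read accesses. Reusing rows already fetched would reduce that traffic, at the cost of extra on-chip storage.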
sobel_slave.v

`timescale 10ns/10ns
module sobel_slave(
    //master slave share i/f
    clk_i,
    rst_i,
    dat_i,
    dat_o,
    //slave i/f
    cyc_i,
    stb_i,
    we_i,
    adr_i,
    ack_o,
    //output
    start,
    O_base_ce,
    D_base_ce,
    int_req,
    //input
    done_set,
    result_row
);

input         clk_i;
input         rst_i;
input  [31:0] dat_i;
output [31:0] dat_o;

input         cyc_i;
input         stb_i;
input         we_i;
input  [21:0] adr_i;
output        ack_o;
output        start;
output        O_base_ce;
output        D_base_ce;
output        int_req;
input         done_set;
input  [31:0] result_row;

reg        int_en;
reg        done;
reg        ack_o;
reg [31:0] dat_o;

/*****************************************************************/

assign start     = cyc_i && stb_i && we_i && adr_i[3:2] == 2'b01; //adr_i[3:2]
assign O_base_ce = cyc_i && stb_i && we_i && adr_i[3:2] == 2'b10;
assign D_base_ce = cyc_i && stb_i && we_i && adr_i[3:2] == 2'b11;

/*************************************************************/

// Wishbone slave i/f

always @(posedge clk_i) //interrupt reg
    if (rst_i)
        int_en <= 1'b0;
    else
        if (cyc_i && stb_i && we_i && adr_i[3:2] == 2'b00)
            int_en <= dat_i[0];
/*****************************************************************************/

always @(posedge clk_i) //status reg
    if (rst_i)
        done <= 1'b0;
    else
        if (done_set)
            // This occurs when last write is acknowledged,
            // and so cannot coincide with a read of the
            // status register
            done <= 1'b1;
        else
            if (cyc_i && stb_i && !we_i && adr_i[3:2] == 2'b00 && ack_o)
                done <= 1'b0;
/********************************************************************/

assign int_req = int_en && done;
/*********************************************************************/
always @(posedge clk_i)
    ack_o <= cyc_i && stb_i && !ack_o;
/*********************************************************************/
always @*
    if (cyc_i && stb_i && !we_i)
        if (adr_i[3:2] == 2'b00)
            dat_o = {31'b0, done}; // status register read
        else
            dat_o = 32'b0;         // other registers read as 0
    else
        dat_o = result_row;        // for master write
endmodule
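
Seen from software, the address decode in sobel_slave.v amounts to a four-entry register map. The following is a minimal C sketch of that map, not code from the original project. It assumes adr_i is a byte address, so that adr_i[3:2] selects registers at byte offsets 0x0, 0x4, 0x8 and 0xC. The enum and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Register select values, taken from the adr_i[3:2] comparisons in sobel_slave.v. */
enum sobel_reg {
    SOBEL_REG_CTRL_STATUS = 0, /* 2'b00: write sets int_en (bit 0), read returns done (bit 0) */
    SOBEL_REG_START       = 1, /* 2'b01: a write asserts start */
    SOBEL_REG_O_BASE      = 2, /* 2'b10: a write asserts O_base_ce */
    SOBEL_REG_D_BASE      = 3  /* 2'b11: a write asserts D_base_ce */
};

/* Byte offset of a register (assumption: adr_i is a byte address). */
uint32_t sobel_reg_offset(enum sobel_reg reg)
{
    return (uint32_t)reg << 2;
}

/* Register selected by a byte address; mirrors the adr_i[3:2] slice. */
enum sobel_reg sobel_reg_decode(uint32_t byte_addr)
{
    return (enum sobel_reg)((byte_addr >> 2) & 0x3u);
}

/* Extract the done flag from a status read, where dat_o = {31'b0, done}. */
int sobel_status_done(uint32_t status_word)
{
    return (int)(status_word & 1u);
}
```

Because only bits 3:2 are decoded, any address whose bits 3:2 match selects the same register; for example, byte address 0x10 aliases the control/status register.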

3, Functional simulation of the hardware
Before running the simulation, we need to write the simulation models and the image pre-processing and post-processing programs.

Below is the functional (pre-layout) simulation result. It shows that the hardware execution time is 23,562,270 cycles, about 23.56 ms.

Functional simulation already yields a processed image. The original image and the Sobel result are compared below:

For the detailed steps of the functional simulation, see the README file in the bmp_post directory. Its contents are as follows:

/*
* Rill create
* rillzhen@gmail.com
*/

1, Make sure the source image is sobel.bmp, 1024x768, 8-bit.
2, Run read_bmp.exe to generate bmp_dat.txt.
3, Copy bmp_dat.txt to the Verilog working directory and run the simulation to generate post_process_dat.txt.
4, Copy post_process_dat.txt to this directory. Start bmp_process.exe first and keep it running, then run bmp_bin.exe. This produces sobel_rst1.bmp, the final result.

Note that this example processes a 1024x768 8-bit BMP file. To process a BMP file of a different size, modify the relevant parts of testbench.v. To process a different image format, rewrite the corresponding pre-processing and post-processing programs.
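
One detail worth keeping in mind when changing the image size: the BMP format pads every stored pixel row to a multiple of 4 bytes. At 1024 pixels of 8 bits each no padding is needed, so a row is exactly 1024 bytes, but other widths may differ. The following C sketch shows the arithmetic; the function names are made up for illustration and are not part of the original tools.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per stored BMP row: rows are padded up to a multiple of 4 bytes. */
uint32_t bmp_row_stride(uint32_t width, uint32_t bits_per_pixel)
{
    assert(bits_per_pixel > 0);
    return ((width * bits_per_pixel + 31u) / 32u) * 4u;
}

/* Total pixel-data bytes of an uncompressed image. */
uint32_t bmp_image_bytes(uint32_t width, uint32_t height, uint32_t bits_per_pixel)
{
    return bmp_row_stride(width, bits_per_pixel) * height;
}
```

For the image used here, bmp_row_stride(1024, 8) is 1024 and the pixel data occupies 786432 bytes. A 1023-pixel-wide 8-bit image would still store 1024 bytes per row, so code that reads exactly `width` bytes per row would drift out of alignment.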

4, Timing simulation of the hardware
Before finally running on the FPGA, we also need a timing (post-layout) simulation. This requires the files with delay information (*.vo) that Quartus II generates during synthesis, as well as the Quartus II libraries.

Synthesizing all the synthesizable files of the sobel module with Quartus II produces sobel.vo and other .vo files, such as sobel_min_1200mv_0c_fast.vo.

The Quartus library files used in the post-layout simulation are cycloneive_atoms.v and altera_primitives.v.

The simulation model files are needed as well: testbench.v, cpu.v, memory.v and arbiter.v.

After nearly half an hour of simulation the result finally came out. Following the same steps as for the functional simulation gives the same processed image.

Note that if the post-layout simulation with sobel.vo reports hold/setup timing violations, sobel_min_1200mv_0c_fast.vo can be used instead.

I have also uploaded the post-layout simulation project:
http://download.csdn.net/detail/rill_zhen/6371857

5, Software implementation
To measure the hardware speedup, we need an equivalent software program that runs on OpenRISC and reports its run time.

The C implementation follows:
bmp.c:

#include <string.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>
#include <sys/time.h>   // gettimeofday(), struct timeval
#include <sys/timeb.h>
#include <time.h>

typedef struct {
    double real;
    double img;
} COMPLEX;

// main() uses the bare name "timeval"; alias it to the system struct
// so that gettimeofday() receives the type it expects.
typedef struct timeval timeval;

typedef unsigned char  BYTE;
typedef unsigned short WORD;
typedef unsigned long  DWORD; // 32 bits on the 32-bit or32 target; the header structs rely on this
typedef long           LONG;

// Bitmap file header.
// The 2-byte file type field is deliberately left out: with it, the struct's
// memory layout (padding) would prevent reading the header correctly.
typedef struct tagBITMAPFILEHEADER {
    DWORD bfSize;      // file size
    WORD  bfReserved1; // reserved, ignored
    WORD  bfReserved2; // reserved, ignored
    DWORD bfOffBits;   // byte offset of the pixel data, i.e. the total length of the first three parts
} BITMAPFILEHEADER;

// Info header BITMAPINFOHEADER, also a struct, defined as follows:

typedef struct tagBITMAPINFOHEADER {
    DWORD biSize;          // size of this struct, 40
    LONG  biWidth;         // bitmap width
    LONG  biHeight;        // bitmap height
    WORD  biPlanes;        // number of planes, 1
    WORD  biBitCount;      // bits per pixel: 1, 2, 4, 8, 16, 24, or 32 in newer files
    DWORD biCompression;   // compression type: 0, 1 or 2; 0 means uncompressed
    DWORD biSizeImage;     // size of the pixel data in bytes
    LONG  biXPelsPerMeter; // horizontal resolution
    LONG  biYPelsPerMeter; // vertical resolution
    DWORD biClrUsed;       // number of colors used; 0 means the default (2^bits per pixel)
    DWORD biClrImportant;  // number of important colors; 0 means all colors are important
} BITMAPINFOHEADER;

// Palette. This applies only to bitmaps that need one; 24-bit and 32-bit bitmaps do not.
// (The number of palette entries appears to equal the number of colors used.)

typedef struct tagRGBQUAD {
    BYTE rgbBlue;     // blue component of the color
    BYTE rgbGreen;    // green component of the color
    BYTE rgbRed;      // red component of the color
    BYTE rgbReserved; // reserved
} RGBQUAD;


void showBmpHead(BITMAPFILEHEADER* pBmpHead)
{
    printf("bmp file head:\n");
    printf("size:%lu\n",pBmpHead->bfSize);
    printf("reserved byte1:%d\n",pBmpHead->bfReserved1);
    printf("reserved byte2:%d\n",pBmpHead->bfReserved2);
    printf("offbit:%lu\n",pBmpHead->bfOffBits);
}

void showBmpInforHead(BITMAPINFOHEADER* pBmpInforHead)
{
    printf("bmp info head:\n");
    printf("structure size:%lu\n",pBmpInforHead->biSize);
    printf("width:%ld\n",pBmpInforHead->biWidth);
    printf("height:%ld\n",pBmpInforHead->biHeight);
    printf("biPlanes:%d\n",pBmpInforHead->biPlanes);
    printf("biBitCount:%d\n",pBmpInforHead->biBitCount);
    printf("compress type:%lu\n",pBmpInforHead->biCompression);
    printf("biSizeImage:%lu\n",pBmpInforHead->biSizeImage);
    printf("X:%ld\n",pBmpInforHead->biXPelsPerMeter);
    printf("Y:%ld\n",pBmpInforHead->biYPelsPerMeter);
    printf("colour used:%lu\n",pBmpInforHead->biClrUsed);
    printf("imp colour:%lu\n",pBmpInforHead->biClrImportant);
}

void showRgbQuan(RGBQUAD* pRGB)
{
    printf("(%-3d,%-3d,%-3d) ",pRGB->rgbRed,pRGB->rgbGreen,pRGB->rgbBlue);
}

void sobelEdge(BYTE* pColorData, int width,BYTE* lineone,BYTE* linetwo,BYTE* linethree);

int main()
{
    timeval tpstart, tpend;
    double timeuse;

    BITMAPFILEHEADER bitHead;
    BITMAPINFOHEADER bitInfoHead;
    FILE* pfile;
    WORD fileType;

    gettimeofday(&tpstart,NULL);
    pfile = fopen("./aa.bmp","rb"); // open the input file
    if(pfile!=NULL)
    {
        printf("file aa.bmp open success.\n");
        // read the bitmap file header
        fread(&fileType,1,sizeof(WORD),pfile);
        // 'B','M' read as a big-endian WORD (OpenRISC); a little-endian host sees 0x4d42
        if(fileType != 0x424d/* 0x4d42*/)
        {
            printf("file is not a bmp file!");
            //return 0;
        }

        fread(&bitHead,1,sizeof(BITMAPFILEHEADER),pfile);

        showBmpHead(&bitHead);
        printf("\n\n");
        // read the bitmap info header
        fread(&bitInfoHead,1,sizeof(BITMAPINFOHEADER),pfile);
        showBmpInforHead(&bitInfoHead);
        printf("\n");
    }
    else
    {
        printf("file open fail!\n");

        return 0;
    }

    // 8 bits per pixel; 0x800 is the same field read without byte swapping on a big-endian CPU
    if((bitInfoHead.biBitCount!=0x800) && (bitInfoHead.biBitCount!=0x8))
    {
        printf("not 256 colour bmp error!");
        // return 0;
    }
    FILE *pwrite=NULL; // output image after Sobel
    pwrite=fopen("Sobel.bmp","wb");
    if(pwrite==NULL)
    {
        printf("new file fail!");
        fclose(pfile);
        return 0; // nothing can be written without an output file
    }
    fwrite(&fileType,sizeof(WORD),1,pwrite);
    fwrite(&bitHead,sizeof(BITMAPFILEHEADER),1,pwrite);
    fwrite(&bitInfoHead,sizeof(BITMAPINFOHEADER),1,pwrite);

    // read the palette and convert every entry to gray
    RGBQUAD *pRgb ;
    int i;
    int rgb2gray;
    long nPlantNum = (long)pow(2,(double)8/*bitInfoHead.biBitCount*/); // number of palette entries

    pRgb=(RGBQUAD *)malloc(sizeof(RGBQUAD));
    for(i=0;i<nPlantNum;i++) // was i<=nPlantNum, which consumed one entry too many
    {
        memset(pRgb,0,sizeof(RGBQUAD));
        fread(pRgb,4,1,pfile);
        rgb2gray=(300* pRgb->rgbRed+590*pRgb->rgbGreen+110*pRgb->rgbBlue)/1000;
        pRgb->rgbRed=rgb2gray;
        pRgb->rgbGreen=rgb2gray;
        pRgb->rgbBlue=rgb2gray;
        fwrite(pRgb,4,1,pwrite);
    }

    int width = 1024;  //bitInfoHead.biWidth;
    int height = 768;  //bitInfoHead.biHeight;

    BYTE *pColorData=(BYTE *)malloc(width); // buffer for one row of pixels
    memset(pColorData,0,width);
    BYTE lineone[1024];
    BYTE linetwo[1024];
    BYTE linethree[1024];
    int num=0;
    int j=0;
    for(num=0;num<height;num++)
    {
        fread(pColorData,1,width,pfile);
        if(num==0)
        {
            // first row: start filling the window and emit a black border row
            for(j=0;j<width;j++)
            {
                linethree[j]=pColorData[j];
                pColorData[j]=0;
            }
            fwrite(pColorData,1,width,pwrite);
        }
        else if(num==1)
        {
            // second row: keep filling the window, nothing to emit yet
            for(j=0;j<width;j++)
            {
                linetwo[j]=linethree[j];
                linethree[j]=pColorData[j];
            }
        }
        else
        {
            // shift the three-row window and emit the row centred on linetwo
            for(j=0;j<width;j++)
            {
                lineone[j]=linetwo[j];
                linetwo[j]=linethree[j];
                linethree[j]=pColorData[j];
            }
            sobelEdge(pColorData,width,lineone,linetwo,linethree);
            fwrite(pColorData,1,width,pwrite);
        }
        if(num==height-1)
        {
            // last row: emit one black border row (the write used to sit inside the per-pixel loop)
            memset(pColorData,0,width);
            fwrite(pColorData,1,width,pwrite);
        }
    }
    fclose(pwrite);
    fclose(pfile);
    free(pRgb); // always allocated above, so always freed
    free(pColorData);
    gettimeofday(&tpend,NULL);
    timeuse=1000000.0*(tpend.tv_sec-tpstart.tv_sec)+tpend.tv_usec-tpstart.tv_usec;
    printf("sobel_soft Used Time us:%lf\n",timeuse);
    return 1;
}
void sobelEdge(BYTE* pColorData, int width,BYTE* lineone,BYTE* linetwo,BYTE* linethree)
{
    BYTE area[3][1024];
    int i=0;
    int j=0;
    // copy the three input rows into a 3 x width window
    for(i=0;i<3;i++)
        for(j=0;j<width;j++)
        {
            if(i==0)
                area[i][j]=lineone[j];
            if(i==1)
                area[i][j]=linetwo[j];
            if(i==2)
                area[i][j]=linethree[j];
        }
    int temp1=0;
    int temp2=0;
    int temp=0;
    int *tempM=(int *)malloc(3*width*sizeof(int));
    memset(tempM,0,3*width*sizeof(int)); // size in bytes, not in elements

    int m=0;
    int n=0;
    for(m=0;m<2;m++)
        for(n=0;n<width;n++)
        {
            if((m==1)&&(n!=width-1)&&(n!=0))
            {
                // vertical derivative: bottom row minus top row
                temp1=area[m+1][n-1]
                     +2*area[m+1][n]
                     +area[m+1][n+1]
                     -area[m-1][n-1]
                     -2*area[m-1][n]
                     -area[m-1][n+1];
                // horizontal derivative: right column minus left column
                temp2=area[m-1][n+1]
                     +2*area[m][n+1]
                     +area[m+1][n+1]
                     -area[m-1][n-1]
                     -2*area[m][n-1]
                     -area[m+1][n-1];
                // temp=(int)((double)sqrt((double)(temp1*temp1))+(double)sqrt((double)(temp2*temp2)));
                temp = (int)(abs(temp1) + abs(temp2));
                if(temp>255)
                {
                    temp=255;
                }
                if(temp<0)
                {
                    temp=0;
                }
                tempM[m*width+n]=(BYTE)temp;
            }
            else
            {
                // border position: output 0 and leave the input window untouched
                // (zeroing area[][] here would corrupt the neighbours read at m==1)
                tempM[m*width+n]=0;
            }
        }
    for(n=0; n<width; n++)
    {
        pColorData[n]=tempM[1*width+n];
    }
    free(tempM);
    return;
}
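
The constants 0x424d and 0x800 in main() arise because BMP header fields are stored little-endian, while the program reads them straight into native integers on a big-endian CPU such as OpenRISC. A more portable alternative is to assemble each field byte by byte. The following is a minimal sketch, assuming the first 54 bytes of the file have already been read into a buffer. The helper names and the struct are illustrative and not part of the original program.

```c
#include <assert.h>
#include <stdint.h>

/* Assemble little-endian fields byte by byte, independent of host endianness. */
uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* The fields the Sobel program cares about, at their standard BMP byte offsets. */
struct bmp_basic_info {
    uint32_t pixel_offset; /* bfOffBits,  offset 10 */
    int32_t  width;        /* biWidth,    offset 18 */
    int32_t  height;       /* biHeight,   offset 22 */
    uint16_t bit_count;    /* biBitCount, offset 28 */
};

/* Parse a 54-byte header. Returns 0 on success, -1 if the "BM" signature is missing. */
int bmp_parse_header(const uint8_t hdr[54], struct bmp_basic_info *out)
{
    if (hdr[0] != 'B' || hdr[1] != 'M')
        return -1;
    out->pixel_offset = read_le32(hdr + 10);
    out->width        = (int32_t)read_le32(hdr + 18);
    out->height       = (int32_t)read_le32(hdr + 22);
    out->bit_count    = read_le16(hdr + 28);
    return 0;
}
```

With this approach the same source gives the same field values on big-endian and little-endian hosts, so the width and height would not need to be hard-coded.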
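Compile in the virtual machine: or32-linux-gcc bmp.c -o sobel
Copy the binary to the development board and run it. The printed output is shown below. It reports a software execution time of 17972.718 ms, which is 17972.718 / 23.56 ≈ 763 times the hardware time, so the hardware speedup is significant. Real systems are more complicated, of course, and the actual gain would not be this large.
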
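
The comparison is plain arithmetic on the two measured times. The following C sketch reproduces it; the function names are illustrative, and the clock period is left as a parameter because the post states only the cycle count and the resulting time.

```c
#include <assert.h>

/* Speedup of hardware over software; both times must use the same unit. */
double speedup(double software_time, double hardware_time)
{
    assert(hardware_time > 0.0);
    return software_time / hardware_time;
}

/* Convert a cycle count to milliseconds for a given clock period in ns. */
double cycles_to_ms(unsigned long cycles, double period_ns)
{
    return (double)cycles * period_ns / 1e6;
}
```

The figures quoted in this post (23,562,270 cycles, about 23.56 ms) correspond to 1 ns per counted cycle, so cycles_to_ms(23562270UL, 1.0) gives about 23.56, and speedup(17972.718, 23.56227) is roughly 763.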

Software result: it differs slightly from the hardware result because the processing algorithm and the pre-processing are not identical, but the difference is small.
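
When comparing the two outputs pixel by pixel, a single-pixel reference can help locate where they diverge. The following is a minimal C sketch of the |Dx| + |Dy| approximation with saturation to 8 bits, the same approximation sobelEdge uses. The function name sobel_pixel and its three-row-pointer interface are illustrative only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Approximate gradient magnitude |Dx| + |Dy| at column col of the middle row,
 * saturated to 255. Requires 1 <= col <= width - 2 so all neighbours exist. */
uint8_t sobel_pixel(const uint8_t *top, const uint8_t *mid,
                    const uint8_t *bot, int col)
{
    assert(col >= 1);
    /* Horizontal derivative: right column minus left column. */
    int dx = (top[col + 1] + 2 * mid[col + 1] + bot[col + 1])
           - (top[col - 1] + 2 * mid[col - 1] + bot[col - 1]);
    /* Vertical derivative: bottom row minus top row. */
    int dy = (bot[col - 1] + 2 * bot[col] + bot[col + 1])
           - (top[col - 1] + 2 * top[col] + top[col + 1]);
    int mag = abs(dx) + abs(dy);
    return (uint8_t)(mag > 255 ? 255 : mag);
}
```

A flat neighbourhood gives 0, while a sharp vertical edge (left columns 0, right column 255) saturates at 255.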

6, Summary
This section designed and implemented a simple Sobel accelerator and wrote the corresponding software. Compared against that software, the hardware speedup is significant.

|
|