Matlab Applications: GPU Acceleration

haidaowang · Posted on 2020-1-20 09:44

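GPUs have developed rapidly in recent years and have gradually overtaken CPUs as the workhorse for multithreaded, highly parallel computation. Matlab is widely used mathematical software, so this post explains how to use GPU acceleration in Matlab.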

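Contents
   0. Prerequisites
   1. Transferring data to the GPU
            1.1 Copying CPU data to the GPU
            1.2 Creating data directly on the GPU
   2. Computing on the GPU
   3. Transferring data back from the GPU
   4. Tips
            4.1 Avoid the GPU when there is no parallel computation
            4.2 If there is no Nvidia graphics card or driver
            4.3 Convert double precision to single precision where possible
   Appendix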

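0. Prerequisites

Using GPU acceleration in Matlab has two prerequisites:

An NVIDIA graphics card is installed in the computer; AMD and Intel graphics cards are not currently supported.
The NVIDIA graphics driver is installed.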
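
As a quick way to check these prerequisites from the Matlab prompt, the following sketch may help. It assumes Parallel Computing Toolbox is installed, since `gpuArray`, `gpuDeviceCount` and `gpuDevice` are provided by that toolbox; the printed text is only illustrative.

```matlab
% Query how many supported GPUs Matlab can see.
nGpu = gpuDeviceCount;
if nGpu > 0
    d = gpuDevice;   % selects and returns the default GPU device
    fprintf('Found %d GPU(s). Using: %s (compute capability %s)\n', ...
            nGpu, d.Name, d.ComputeCapability);
else
    disp('No supported GPU detected; check the card and the driver.');
end
```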

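1. Transferring data to the GPU

1.1 Copying CPU data to the GPU

To compute on the GPU, you only need to copy the data from CPU memory to the GPU.

G = gpuArray(M);

The line above stores the GPU copy under a new name. You can also overwrite the original variable directly.

M = gpuArray(M);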
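
The difference between the two variables can be made visible with `class`. This is a minimal sketch; the 4-by-4 test matrix from `magic` is just an assumed example input.

```matlab
M = magic(4);        % ordinary array in CPU (host) memory
G = gpuArray(M);     % copy of the same values in GPU memory

disp(class(M));      % prints: double
disp(class(G));      % prints: gpuArray
disp(size(G));       % the size is unchanged by the transfer
```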

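1.2 Creating data directly on the GPU

A = zeros(10, 'gpuArray');

Matrices of zeros or ones can be created directly on the GPU in this way; the 'gpuArray' option has to be passed as the last argument.

r = gpuArray.rand(1, 100) % one row, one hundred columns

This generates a random matrix on the GPU.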
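
A few more variations on the same idea, as a sketch. The sizes are arbitrary, and `rand(..., 'gpuArray')` is used here as the function-style equivalent of the `gpuArray.rand` call above.

```matlab
A = zeros(10, 'gpuArray');              % 10-by-10 zeros, double precision
B = ones(3, 4, 'single', 'gpuArray');   % 3-by-4 ones, single precision
R = rand(1, 100, 'gpuArray');           % 1-by-100 uniform random numbers

% All three live in GPU memory; no host-to-device copy was needed.
disp(class(A));   % prints: gpuArray
disp(size(B));
disp(size(R));
```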

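2. Computing on the GPU

Basic operations run on the GPU as usual, with the same syntax as ordinary matrix computations.

A = abs(A);

The operations that can run on the GPU can be listed with the command

methods(gpuArray)

The full list of operations Matlab can run on the GPU is given in the appendix, which reproduces Matlab's own output.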
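
To illustrate that several operations can be chained without leaving the GPU, here is a small sketch. The matrix size and the particular combination of functions are assumptions chosen for illustration; all functions used appear in the appendix list.

```matlab
X = rand(1000, 'gpuArray');   % data created on the GPU
Y = abs(fft(X));              % fft and abs both execute on the GPU
Z = X * X.' + 2 .* Y;         % matrix product plus element-wise arithmetic
s = sum(Z(:));                % reduction; the result is still a gpuArray

disp(class(s));               % prints: gpuArray
```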

3. Transferring data back from the GPU

B = gather(A);

The command above copies the data from the GPU back to the CPU.
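
Putting the three steps together (copy in, compute, copy out), a complete round trip might look like the sketch below. The input matrix and the arithmetic are assumed examples; the values are small integers, so the comparison with the CPU result is exact.

```matlab
M = magic(4);          % data on the CPU
G = gpuArray(M);       % step 1: copy to the GPU
S = G .^ 2 + 1;        % step 2: compute on the GPU
B = gather(S);         % step 3: copy the result back to the CPU

disp(class(B));        % prints: double
disp(isequal(B, M .^ 2 + 1));   % prints: 1 (same as the CPU computation)
```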

4. Tips

4.1 Avoid the GPU when there is no parallel computation

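index = 0;
index = gpuArray(index);   % comment out this line to run on the CPU
for i = 1 : 10000
    tic
    for j = 1 : 100000
        index = index + 1;   % scalar update, strictly serial
    end
    toc   % elapsed time for one inner loop of 100000 additions
end
disp(index)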

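With the second line in place, the program runs on the GPU; comment that line out and it runs on the CPU. The running times on my computer are shown in the table below.

Device    CPU          GPU
Time      0.00010 s    1.973017 s

As the table shows, a program that runs as a single thread is still better run on the CPU. The CPU has a higher clock speed, and every scalar operation on a gpuArray also pays the overhead of launching work on the GPU, whereas the GPU's strength is running many threads at the same time.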
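
For contrast with the serial loop, the following sketch times one large, data-parallel operation on both devices. The matrix size is an assumption, and the actual timings depend entirely on the hardware, so none are quoted here. `timeit` and `gputimeit` are used instead of `tic`/`toc` because they repeat the measurement and, for the GPU, wait for the computation to finish.

```matlab
n  = 2000;                 % assumed problem size
Ac = rand(n);              % matrix in CPU memory
Ag = gpuArray(Ac);         % same matrix in GPU memory

tCpu = timeit(@() Ac * Ac);       % one large matrix product on the CPU
tGpu = gputimeit(@() Ag * Ag);    % the same product on the GPU

fprintf('CPU: %.4f s, GPU: %.4f s\n', tCpu, tGpu);
```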

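4.2 If there is no Nvidia graphics card or driver

If there is no Nvidia graphics card or the driver is missing, Matlab reports an error when a GPU function is called.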
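
One way to keep a script usable on machines without a GPU is to test for a device first and fall back to the CPU. This is only a sketch: the variable name `useGpu` and the sample computation are assumptions, and the `try`/`catch` covers the case where the GPU functions are unavailable or fail.

```matlab
% Decide once whether a GPU can be used.
try
    useGpu = gpuDeviceCount > 0;
catch
    useGpu = false;        % GPU functions unavailable or driver problem
end

M = rand(100);
if useGpu
    M = gpuArray(M);       % move the data only when a GPU is present
end

R = sum(M(:) .^ 2);        % identical code on either device
if useGpu
    R = gather(R);         % bring the scalar result back to the CPU
end
fprintf('useGpu = %d, result = %.4f\n', useGpu, R);
```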

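4.3 Convert double precision to single precision where possible

Where conditions permit, convert double-precision data to single precision during the computation. On the GPU, single-precision arithmetic is markedly faster than double-precision arithmetic, so this can cut the running time considerably.
Note: differences between single and double precision

Data type           Size                 Approximate range (magnitude)    Significant decimal digits
Single precision    4 bytes (32 bits)    1.2E-38 to 3.4E+38               about 7
Double precision    8 bytes (64 bits)    2.2E-308 to 1.8E+308             about 16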
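
The trade-off can be measured directly. The sketch below converts a GPU array to single precision, times the same matrix product in both precisions, and reports the rounding error introduced by the conversion. The size `n` is an assumption and the speed-up varies widely between GPU models, so no timings are quoted.

```matlab
n  = 2000;                          % assumed problem size
Ad = rand(n, 'gpuArray');           % double precision, on the GPU
As = single(Ad);                    % single-precision copy, still on the GPU

tD = gputimeit(@() Ad * Ad);        % double-precision product
tS = gputimeit(@() As * As);        % single-precision product
fprintf('double: %.4f s, single: %.4f s\n', tD, tS);

% Largest rounding error caused by converting the inputs to single.
convErr = gather(max(abs(double(As(:)) - Ad(:))));
fprintf('max conversion error: %.3g\n', convErr);
```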

Appendix

>> methods(gpuArray)
Methods for class gpuArray:
abs eq ipermute quiver3
accumarray erf iradon rad2deg
acos erfc isaUnderlying radon
acosd erfcinv isbanded rdivide
acosh erfcx isdiag real
acot erfinv isempty reallog
acotd errorbar isequal realpow
acoth existsOnGPU isequaln realsqrt
acsc exp isequalwithequalnans reducepatch
acscd expint isfinite reducevolume
acsch expm isfloat regionprops
all expm1 ishermitian rem
and eye isinf repelem
angle ezcontour isinteger repmat
any ezcontourf islogical reshape
applylut ezgraph3 ismember rgb2gray
area ezmesh ismembertol rgb2hsv
arrayfun ezmeshc isnan rgb2ycbcr
asec ezplot isnumeric ribbon
asecd ezplot3 isocaps roots
asech ezpolar isocolors rose
asin ezsurf isonormals rot90
asind ezsurfc isosurface round
asinh factorial isreal scatter
assert false issorted scatter3
atan feather issparse sec
atan2 fft issymmetric secd
atan2d fft2 istril sech
atand fftfilt istriu semilogx
atanh fftn kmeans semilogy
bandwidth fill knnsearch setdiff
bar fill3 ldivide setxor
bar3 filter le shiftdim
bar3h filter2 legendre shrinkfaces
barh find length sign
besselj fix line sin
bessely flip linspace sind
beta flipdim log single
betainc fliplr log10 sinh
betaincinv flipud log1p size
betaln floor log2 slice
bicg fplot logical smooth3
bicgstab fprintf loglog sort
bicgstabl full logspace sortrows
bitand gamma lsqr sparse
bitcmp gammainc lt spfun
bitget gammaincinv lu spones
bitor gammaln mat2gray sprand
bitset gather mat2str sprandn
bitshift ge max sprandsym
bitxor gmres mean sprintf
bsxfun gop medfilt2 spy
bwdist gpuArray mesh sqrt
bwlabel gradient meshc stairs
bwlookup gt meshgrid std2
bwmorph head meshz stdfilt
cast hist min stem
cat histc minres stem3
cconv histcounts minus stream2
cdf2rdf histeq mldivide stream3
ceil histogram mod streamline
cgs horzcat mode streamparticles
chol hsv2rgb movmean streamribbon
circshift hypot movstd streamslice
clabel idivide movsum streamtube
classUnderlying ifft movvar stretchlim
comet ifft2 mpower sub2ind
comet3 ifftn mrdivide subsasgn
compass im2double mtimes subsindex
complex im2int16 nan subspace
cond im2single ndgrid subsref
coneplot im2uint16 ndims subvolume
conj im2uint8 ne sum
contour imabsdiff nextpow2 superiorfloat
contour3 imadjust nnz surf
contourc imag nonzeros surfc
contourf image norm surfl
contourslice imagesc normest svd
conv imbothat normxcorr2 svds
conv2 imclose not swapbytes
convn imcomplement nthroot symmlq
corr2 imdilate null tail
corrcoef imerode num2str tan
cos imfill numel tand
cosd imfilter nzmax tanh
cosh imgaussfilt ones tfqmr
cot imgaussfilt3 or times
cotd imgradient padarray transpose
coth imgradientxy pagefun trapz
cov imhist pareto tril
csc imlincomb patch trimesh
cscd imnoise pcg trisurf
csch imopen pcolor triu
ctranspose imreconstruct pdist true
cummax imregdemons pdist2 typecast
cummin imregionalmax permute uint16
cumprod imregionalmin pie uint32
cumsum imresize pie3 uint64
curl imrotate planerot uint8
deg2rad imrotate_old plot uminus
del2 imshow plot3 union
det imtophat plotmatrix unique
detectFASTFeatures ind2sub plotyy uniquetol
detectHarrisFeatures inf plus unwrap
detrend inpolygon polar uplus
diag int16 poly var
diff int2str polyder vertcat
discretize int32 polyfit vissuite
disp int64 polyval volumebounds
display int8 polyvalm voronoi
divergence interp1 pow2 waterfall
dot interp2 power xcorr
double interp3 prod xor
edge interpn psi ycbcr2rgb
eig interpstreamspeed qmr zeros
end intersect qr
eps inv quiver
Static methods:
colon rand randperm
freqspace randi speye
loadobj randn
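
Rather than scanning the list by eye, it can be searched programmatically. The sketch below is one possible approach; the names in `wanted` are arbitrary examples, and the result depends on the Matlab release and the installed toolboxes, so it may differ from the list above.

```matlab
names  = methods('gpuArray');           % cell array of method names
wanted = {'fft', 'svd', 'gather'};      % functions to look up (examples)

for k = 1:numel(wanted)
    found = ismember(wanted{k}, names);
    fprintf('%-8s listed as a gpuArray method: %d\n', wanted{k}, found);
end
```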

CCxiaom · Posted on 2020-1-20 18:00

Matlab Applications GPU Acceleration

ExxNEN · Posted on 2020-1-21 17:51

Matlab Applications: GPU Acceleration
