1789 1

[Weka及其他] MATLAB课程:代码示例之C Code Generation(四) [推广有奖]

企业贵宾

巨擘

0%

还不是VIP/贵宾

-

威望
4
论坛币
624047 个
通用积分
178.9606
学术水平
918 点
热心指数
987 点
信用等级
841 点
经验
398972 点
帖子
9798
精华
48
在线时间
17322 小时
注册时间
2014-8-19
最后登录
2022-11-2

+2 论坛币
k人 参与回答

经管之家送您一份

应届毕业生专属福利!

求职就业群
赵安豆老师微信:zhaoandou666

经管之家联合CDA

送您一个全额奖学金名额~ !

感谢您参与论坛问题回答

经管之家送您两个论坛币!

+2 论坛币

MATLAB课程:代码示例之C Code Generation(四)


Simulation Acceleration using System Objects,

MATLAB Coder and Parallel Computing Toolbox


This example shows three ways to accelerate the simulation of communications algorithms in MATLAB®. In particular, it showcases the effects that using System objects, MATLAB to C code generation, and parallel processing runs (using the MATLAB parfor function) have on simulation speed.

The combined effect of using these methods may speed up a typical simulation time by an order of magnitude. The difference is tantamount to running the simulation overnight or within just a few hours.

The System objects this example features are accessible in the Communications System Toolbox™ product. In order to run the MATLAB to C code generation section of this example, you must have a MATLAB Coder™ license. In order to run the parallel processing section of this example, you must have a Parallel Computing Toolbox™ license.


Introduction

This example examines various implementations of the following transceiver system:


This system is composed of a transmitter, a channel model, and a receiver. The transmitter processes the input bit stream with a convolutional encoder, an interleaver, a modulator, and a MIMO space-time block encoder [ 1 ], [ 2 ]. The transmitted signal is then processed by a 2x2 MIMO channel and an additive white gaussian noise (AWGN) channel. The receiver processes its input signal with a 2x2 MIMO space-time block decoder, a demodulator, a deinterleaver, and a Viterbi decoder to recover the best estimate of the input bit stream at the receiver.

Structure of the Example

The workflow of this example is as follows:

  • Start with a function-based algorithm as your baseline

  • Use the MATLAB Profiler GUI to identify speed bottlenecks

  • Improve the simulation time using System objects

  • Accelerate the simulation with MATLAB to C code generation

  • Achieve even faster simulation using parallel processing runs


Start with a Function-based Algorithm as your Baseline

Start with a function that represents the first version or baseline implementation of this algorithm. The inputs to the commaccelerationbaseline function are the Eb/No value of the current frame (EbNo) and the number of bits the algorithm processes (MaxNumBits). Eb/No is the ratio of energy per bit to noise power spectral density. The function output is the bit error rate (BER) of the algorithm. Note that the baseline algorithm implementation uses functions and data objects from the Communication System Toolbox, along with some user-defined functions, such as localmimoencoder, localmimodecoder and localmimochannel.

type commaccelerationbaseline


function ber = commaccelerationbaseline(EbNo, maxNumBits)% COMMACCELERATIONBASELINE Simulate a communications link with functions.%   BER = COMMACCELERATIONBASELINE(EBNO, MAXNUMBITS) returns the bit error rate%   (BER) of a communications link that includes convolutional coding,%   interleaving, PSK modulation, an Alamouti space-time block code, and a MIMO%   channel with AWGN.  EBNO is the Eb/No of the AWGN channel in dB, and%   MAXNUMBITS is the number of bits simulated before the simulation stops.  The%   simulation uses functions to implement the above operations.%   Copyright 2011-2014 The MathWorks, Inc.rng defaultFRM = 2000;                                              % Input Bit FrameM = 4;                                                   % Modulation Orderk = log2(M);                                             % Bits per SymbolcodeRate = 1/2;                                          % Coding RateadjSNR = EbNo - 10*log10(1/codeRate) + 10*log10(k);tblen = 32;trellis = poly2trellis(7,[171 133]);LEN=4012;permvec=[1:3:LEN 2:3:LEN 3:3:LEN]';%% Run SimulationtotErrs = 0;totBits = 0;while (totBits <= maxNumBits)    data = [randi([0 1],FRM,1);false(6,1)];    u1 = convenc(data,trellis);                    % Convolutional Encoder    u2 = intrlv(double(u1),permvec');              % Interleaver    u3 = bi2de(reshape(u2,FRM+6,2));               % Convert to M-ary symbols    u4 = pskmod(u3,M,pi/4,'gray');                 % QPSK Modulator    u5 = localmimoencoder(2,u4);                   % Alamouti Space-Time Block Encoder    [u6, u7] = localmimochannel(u5,2);             % 2x2 MIMO Channel    u8 = awgn(u6, adjSNR, 'measured', [], 'dB');   % AWGN channel    u9 = localmimodecoder(u8,u7,2);                % Alamouti Space-Time Block Decoder    uA = pskdemod(u9,M,pi/4,'gray');               % QPSK Demodulator    uB = reshape(de2bi(uA),LEN,1);                 % Convert to binary    uC = deintrlv(uB,permvec');                    % Deinterleaver    uD = vitdec(uC(:),trellis,tblen, ...           % Viterbi Decoder        'term','hard');    bitErrs = biterr(uD(1:FRM),data(1:FRM));    totErrs = totErrs + bitErrs;    totBits = totBits + FRM;endber = totErrs/totBits;


As a starting point, measure the time it takes to run this baseline algorithm in MATLAB. Use the MATLAB timing functions (tic and toc) to record the elapsed time for running this function, as it is called within a for-loop that iterates over Eb/No values from 0 to 7 dB.

MaxSNRdB=7;EbNo=1;MaxNumBits=2e5;N=1;str='Baseline';commaccelerationbaseline(EbNo,1e4);berBaseline=zeros(size(0:MaxSNRdB));fprintf(1,'Processing the baseline algorithm.\n');tic;for EbNo=0:MaxSNRdB    y=commaccelerationbaseline(EbNo,MaxNumBits);    berBaseline(EbNo+1)=y;enda=toc;


Processing the baseline algorithm.


The result shows the simulation time (in seconds) of the baseline algorithm. Use this measurement as a yardstick to compare with subsequent versions of the algorithm.

commaccelerationreportresults(N,a,a,str);


----------------------------------------------------------------------------------------------Versions of the Transceiver                         | Elapsed Time (sec)| Acceleration Ratio1. Baseline                                         |           48.1736 |       1.0000----------------------------------------------------------------------------------------------


Use the MATLAB Profiler GUI to Identify Speed Bottlenecks

Identify the processing bottlenecks and problem areas of the baseline algorithm using the MATLAB Profiler. Obtain the profiler information by executing the following script:

profile ony=commaccelerationbaseline(EbNo,1e5);profile off


The Profiler report presents the execution time for each function call of the algorithm in descending order. The first few functions that the Profiler window depicts represent the speed bottleneck of the algorithm. In this case, two user-defined functions (localmimodecoder and localmimochannel) and the vitdec function (Viterbi decoder function of the System Toolbox) are identified as the major speed bottlenecks.

Improve the Simulation Time using System objects

The function commaccelerationsystemobjects implements the second version of the algorithm, which uses System objects from the Communications System Toolbox. Among ten function calls found in the baseline version of this algorithm, nine calls are replaced by corresponding calls to available System objects. The resulting commaccelerationsystemobjects function is composed of two distinct parts:

  • Declaration, which creates the System objects

  • Execution, which calls the step method for each System object in order to perform specific operations.


type commaccelerationsystemobjects


function ber = commaccelerationsystemobjects(EbNo, maxNumBits)% COMMACCELERATIONSYSTEMOBJECTS Simulate a comms link with System objects.%   BER = COMMACCELERATIONSYSTEMOBJECTS(EBNO, MAXNUMBITS) returns the bit error rate%   (BER) of a communications link that includes convolutional coding,%   interleaving, PSK modulation, an Alamouti space-time block code, and a MIMO%   channel with AWGN.  EBNO is the Eb/No of the AWGN channel in dB, and%   MAXNUMBITS is the number of bits simulated before the simulation stops.  The%   simulation uses System objects to implement the above operations.%   Copyright 2011 The MathWorks, Inc.%#codegencoder.extrinsic('rng');rng('default');FRM = 2000;                                      % Input Bit FrameM=4;                                             % Modulation Orderk=log2(M);                                       % Bits per SymbolcodeRate=1/2;                                    % Coding RateadjSNR=EbNo-10*log10(1/codeRate)+10*log10(k);tblen = 32;trellis = poly2trellis(7, [171 133]);LEN=4012;permvec=[1:3:LEN 2:3:LEN 3:3:LEN]';%% Setup Algorithm Parameterspersistent hConvEncoder hViterbi hInterleaver hDeinterleaverpersistent hModulator hDeModulator hOSTBCComb hOSTBCEnc hAWGNif isempty(hConvEncoder)    hConvEncoder=comm.ConvolutionalEncoder(trellis,...        'TerminationMethod','Terminated');        hModulator = comm.PSKModulator('ModulationOrder',4, ...        'SymbolMapping','gray', ...        'PhaseOffset',0, ...        'BitInput',true);    hAWGN=comm.AWGNChannel('NoiseMethod','Variance',...        'VarianceSource','Input port');    hDeModulator = comm.PSKDemodulator('ModulationOrder',4, ...        'SymbolMapping','gray', ...        'PhaseOffset',0, ...        'BitOutput',true, ...        'DecisionMethod','Hard decision');    hViterbi=comm.ViterbiDecoder(trellis,...        'InputFormat','Hard',...        'TracebackDepth',tblen,...        'OutputDataType', 'logical',...        'TerminationMethod','Terminated');    hOSTBCEnc = comm.OSTBCEncoder('NumTransmitAntennas', 2);    hOSTBCComb = comm.OSTBCCombiner('NumTransmitAntennas',2,'NumReceiveAntennas',2);    hInterleaver=   comm.BlockInterleaver('PermutationVector',permvec);    hDeinterleaver= comm.BlockDeinterleaver('PermutationVector',permvec);end%% Run SimulationtotErrs = 0; totBits = 0;while (totBits < maxNumBits)    data      = randn(FRM, 1)>0;    u1        = step(hConvEncoder, data);        % Convolutional Encoder    u2        = step(hInterleaver,u1);           % Interleaver    u3        = step(hModulator, u2);            % QPSK Modulator    u5        = step(hOSTBCEnc,u3);              % Alamouti Space-Time Block Encoder    [u6,~,u7] = localmimochannel(u5,2);          % 2x2 MIMO Channel    sigDB     = 10*log10(var(u6(:)));              noisevar  = real(10.^(0.1*(sigDB-adjSNR)));    u8        = step(hAWGN,u6,noisevar);         % AWGN channel    u9        = step(hOSTBCComb,u8,u7);          % Alamouti Space-Time Block Decoder    uA        = step(hDeModulator, u9);          % QPSK Demodulator    uB        = step(hDeinterleaver,uA);         % Deinterleaver    uC        = step(hViterbi,uB);               % Viterbi Decoder    bitErrs   = sum(abs(uC(1:FRM)-data));    totErrs   = totErrs + bitErrs;    totBits   = totBits + FRM;endber = totErrs/totBits;


Measure the simulation time for this version of the algorithm. Record the elapsed time for running this function in the same for-loop as before.

N=2;str='Using System objects';commaccelerationsystemobjects(EbNo,1e4);berSystemobject=zeros(size(berBaseline));fprintf(1,'Processing the System object version of the algorithm.\n');tic;for EbNo=0:MaxSNRdB    y=commaccelerationsystemobjects(EbNo,MaxNumBits);    berSystemobject(EbNo+1)=y;endb=toc;


Processing the System object version of the algorithm.


The result shows the simulation time of the System object version of the algorithm. Note that by replacing calls to the functions with calls to System objects, the algorithm runs faster. This behavior is expected because one of the advantages of using System objects is efficiency. Algorithms using System objects separate declaration from execution, which avoids the repeated input validation and verification routines found in function-based algorithms. The algorithm implementation that uses System objects performs parameter handling and initializations only once, outside the loop, and exploits efficient MEX implementations of the System objects, which improves overall simulation performance.

commaccelerationreportresults(N,a,b,str);


----------------------------------------------------------------------------------------------Versions of the Transceiver                         | Elapsed Time (sec)| Acceleration Ratio1. Baseline                                         |           48.1736 |       1.00002. Using System objects                             |           12.8705 |       3.7429----------------------------------------------------------------------------------------------


Accelerate the Simulation with MATLAB to C Code Generation

MATLAB Coder generates portable and readable C code from algorithms that are part of the MATLAB code generation subset. The function commaccelerationsystemobjects, the second version of the algorithm, uses System objects that support code generation with MATLAB Coder. Therefore, this algorithm can be compiled and linked into MATLAB as a MEX (MATLAB executable) function. Use the MATLAB Coder codegen command to compile the System objects version of the algorithm into a MEX function (called commaccelerationsystemobjects_mex).

Note: You must have a MATLAB Coder license to run this section of the example.

codegen('commaccelerationsystemobjects.m','-args',{EbNo,MaxNumBits})


Measure the simulation time for the MEX version of the algorithm. Record the elapsed time for running this function in the same for-loop as before.

N=3;str='MATLAB to C code generation';commaccelerationsystemobjects_mex(EbNo,1e4);berCodegen=zeros(size(berBaseline));fprintf(1,'Processing the MEX function of the second version of the algorithm.\n');tic;for EbNo=0:MaxSNRdB    y=commaccelerationsystemobjects_mex(EbNo,MaxNumBits);    berCodegen(EbNo+1)=y;endc=toc;


Processing the MEX function of the second version of the algorithm.


The result shows the simulation time of the MEX version of the algorithm. Note that by compiling the System object algorithm into a MEX function, the MEX version of the algorithm runs faster than both the second and the baseline versions of the algorithm. This behavior is expected because one of the advantages of using MATLAB to C code generation is simulation acceleration. Although the algorithm that uses System objects is highly optimized, code generation can accelerate simulation by locking-down the sizes and data-types of variables inside the function. This process makes the execution more efficient because it removes the overhead of the interpreted language that checks for size and data-type in every line of the code.

commaccelerationreportresults(N,a,c,str);


----------------------------------------------------------------------------------------------Versions of the Transceiver                         | Elapsed Time (sec)| Acceleration Ratio1. Baseline                                         |           48.1736 |       1.00002. Using System objects                             |           12.8705 |       3.74293. MATLAB to C code generation                      |            5.2268 |       9.2167----------------------------------------------------------------------------------------------


Achieve Even Faster Simulation using Parallel Processing Runs

Utilize multiple cores to increase simulation acceleration by running tasks in parallel. Use parallel processing runs (parfor loops) in MATLAB to perform the work on the number of available workers. Parallel Computing Toolbox enables you to run different iterations of the simulation in parallel. Use the parpool function to reserve a number of MATLAB workers for executing a subsequent parfor-loop. In this example, two workers run locally on a MATLAB client machine.

Note: You must have a Parallel Computing Toolbox license to run this section of the example.

if isempty(gcp('nocreate')), parpool; end


Starting parallel pool (parpool) using the 'local' profile ... connected to 8 workers.


Measure the simulation time for the MEX version of the algorithm that executes within a parfor-loop rather than a for-loop used in the previous cases.

N=4;str='Parallel simulation runs with parfor';commaccelerationsystemobjects_mex(EbNo,1e4);berPct=zeros(size(berBaseline));fprintf(1,'Processing the MEX function of the second version of the algorithm within a parfor-loop.\n');tic;parfor EbNo=0:MaxSNRdB    y=commaccelerationsystemobjects_mex(EbNo,MaxNumBits);    berPct(EbNo+1)=y;endd=toc;


Processing the MEX function of the second version of the algorithm within a parfor-loop.


The result shows the simulation time of the MEX version of the second algorithm executing within a parfor-loop. Note that by running this version within a parfor-loop we get the fastest simulation performance. The basic concept of a parfor-loop is the same as the standard MATLAB for-loop. The difference is that parfor divides the loop iterations into groups so that each worker executes some portion of the total number of iterations. Because several MATLAB workers can be computing concurrently on the same loop, a parfor-loop provides significantly better performance than its analogous for-loop.

commaccelerationreportresults(N,a,d,str);


----------------------------------------------------------------------------------------------Versions of the Transceiver                         | Elapsed Time (sec)| Acceleration Ratio1. Baseline                                         |           48.1736 |       1.00002. Using System objects                             |           12.8705 |       3.74293. MATLAB to C code generation                      |            5.2268 |       9.21674. Parallel simulation runs with parfor             |            2.8600 |      16.8440----------------------------------------------------------------------------------------------


Summary

The combined effects of

  • System objects,

  • MATLAB to C code generation and

  • Parallel processing runs


can significantly speed up simulations of your communications algorithms.

  • System objects help improve simulation speed by performing parameter handling and initializations only once outside the execution loop, by exploiting efficient MEX implementations of the System objects, and by avoiding repeated input validation and verification routines found in function-based algorithms.


  • MATLAB to C code generation accelerates simulation by locking-down data-types and sizes of every variable and by reducing the overhead of the interpreted language that checks for the size and data-type of variables in every line of the code.


  • Parallel processing runs can substantially accelerate simulation by computing different iterations of your algorithm concurrently on a multitude of available MATLAB workers.


Further Exploration

In this example we use the function parpool to reserve a number of MATLAB workers that run locally on our MATLAB client machine. By modifying the parallel configurations, you can accelerate the simulation even further by running the algorithm on a larger cluster of workers that are not on your MATLAB client machine. For a description of how to manage and use parallel configurations, see Programming with User Configurations page of Parallel Computing Toolbox User's Guide.

Note on Numerical Performance

In this example, the baseline and System object versions of the algorithm are not numerically identical. The user-defined function localmimodecoder used in the baseline algorithm implements a slightly different version of the MIMO space-time block decoding operation than the one used in the comm.OSTBCCombiner System object of the second version of the algorithm. Nevertheless, when you run each of the first three versions of the algorithm (the baseline version, the System object version, and the MEX function of the second version) with a sufficiently large number of input bits, you will get very similar BER curves indicating similar numerical performance for different versions of the algorithm. Here you find the BER curves obtained by running each of the three versions of the algorithm with the number of input bits set to ten millions (i.e. MaxNumBits=1e7).


Appendix

The following functions are used in this example.

Selected References
  • S. M. Alamouti, "A simple transmit diversity technique for wireless communications," IEEE® Journal on Selected Areas in Communications, vol. 16, no. 8, pp. 1451-1458, Oct. 1998.

  • V. Tarokh, H. Jafarkhami, and A. R. Calderbank, "Space-time block codes from orthogonal designs," IEEE Transactions on Information Theory, vol. 45, no. 5, pp. 1456-1467, Jul. 1999.




二维码

扫码加我 拉你入群

请注明:姓名-公司-职位

以便审核进群资格,未注明则拒绝

关键词:Generation MATLAB课程 ration MATLAB matla MATLAB课程 代码示例 CCodeGeneration


https://www.cda.cn/?seo-luntan
高薪就业·数据科学人才·16年教育品牌
沙发
我有我的love 在职认证  发表于 2016-3-16 14:05:31 |只看作者 |坛友微信交流群
书山有路勤为径,学海无涯苦作舟!!

使用道具

您需要登录后才可以回帖 登录 | 我要注册

本版微信群
加好友,备注cda
拉您进交流群

京ICP备16021002-2号 京B2-20170662号 京公网安备 11010802022788号 论坛法律顾问:王进律师 知识产权保护声明   免责及隐私声明

GMT+8, 2024-11-6 07:43