VS DSP USER'S MANUAL

1. Err LDC 0x400 mr0 fractional amp saturation mode LDC input 1 0 point to the newest sample LDC Oxa00f il modulo 1 addressing could be linear 1 LDC coef i2 LDC ES post increment by 1 addressing LDC output i4 pointer to output buffer AND a NULL a LDX 10 b1 LDY 12 b0 Rev 2 6 Page 68 March 8 2001 USER S MANUAL VS_DSP LDC LOOP MUL fFirloop MAC STX endfir end 8 SOFTWARE EXAMPLES clear a reg load first sample coef pair 15 1s loop count number of loops minus one use otherwise unused ls register ls firloop start looping b1 b0 LDX 10 b1 LDY 12 b0 perform first multiply load next pair bl b0 a LDX 10 b1 LDY 12 b0 al 14 store result 8 2 Double Precision FIR Transversal Filter use pipelined MAC to implement FI This code implements an double precision single sample FIR filter The input and the filter coefficients are 32 bits wide the intermediate results 64 bits Algorithm A x 216 B x C x 2 D AC x 2 AD x 2 BC x 2 BD In this example AC is first added to a reg then BD to b reg and after that BC to a1 b0 and finally AD to al b0 fract 15 sect data_x X input input hi uword 0x9234 0 Data x6666 0x7654 xffff 0x5656 Data uword 0x5678 0x4444 0x9f01 zero 14 output output_hi zero 16 coef c
2. STX Opl Op2 Op3 Opl gt X Op2 Op3 Flags X Load or store a register from or to X memory This instruction uses two index registers to generate a long 2 xdataaddress memory address Op3 is always In where Op2 is In Coding parallel move Rev 2 6 Page 46 March 8 2001 USER S MANUAL VS_DSP 6 INSTRUCTION SET REFERENCE 31 17 16 109 65 0 arithmetic opcode 0010100 RRRRRR Opl rrr Op2 s store load Table 6 2 Loop count register coding 00008 0013 80 51 000la 0010a COC OOlla 0 MR PRT LOOP Start a hardware loop delayed LOOP Opl addr Opl lt LC addr lt LE PC 2 LS Flags L 0 This optional instruction starts a hardware loop The instruction carries a register num ber as encoded in Table 6 2 Loop count and an absolute loop end address which can be calculated by the assembler The LE indicates the address of the last instruc tion within the loop body The loop start is implicitly the second instruction from the LOOP instruction See section 5 2 for details Note the one delay slot associated to this instruction Coding 3 28 27 24 23 20 19 16 15 12 11 8 1 7 43 0 0010 01nninnanniannninnnaninnnnIinndr irrrr rrrrr Opl loop count nn nn absolute loop end address d don t care bit Rev 2 6 Page 47 March 8 2001 USER S MANUAL VS_DSP 6 INSTRUCTION SET REFERENCE LS
3. 4.2 Programming Model
The program control unit has the following registers: [register diagram, bits 15..0: optional loop registers; page registers IPR0 and IPR1]. The width of the registers is the program address width (pa). Above it is assumed that pa is 16.
4.2.1 PC
PC is the program counter. It is not directly accessible by the programmer. PC is loaded with the program address bus value + 1 on all cycles except when a new loop round starts. In this case PC is loaded with LS. In reset, PC is copied to LR1. In instruction fetches the program address bus is driven either from PC, LR0, LR1, the decoded instruction (jump target address), the reset vector address or the interrupt vector address.
4.2.2 LR0
LR0 is used in indirect jumps. The JRcc instruction causes the instruction to be fetched from the LR0 address instead of the PC address if condition cc is true. LR0 can be used to form subroutines by saving the return address to LR0 and executing JRcc at the end of the subroutine. If nested subroutines are needed, LR0 must be saved and restored by the calling subroutine.
4.2.3 LR1
LR1 is used in interrupt returns. The RETI instruction causes the instruction to be fetched from the LR1 address instead of the PC address. PC is copie
4. [Figure 5.3: Single instruction loop start. Figure 5.4: Single instruction, single round loop. Figure 5.5: Loop end when LC 0.]
5.2 Loops (Optional)
The loop mechanism is optional. It has three registers: the loop start register (LS), the loop end register (LE) and the loop count register (LC). Change-of-flow instructions cannot be at the loop end address or immediately before it.
The LOOP instruction starts a hardware loop. LOOP has one delay slot, i.e. the loop start address is LOOP + 2. This results from the fact that the instruction at LOOP + 1 (the delay slot) is fetched before the loop registers are updated by the LOOP instruction. Fig. 5.2 and Fig. 5.3 illustrate the start of a loop. A loop can also be initiated by setting LS, LE and LC to appropriate values.
When the program fetch address equals LE, the value of LC is checked. If LC is not equal to zero, it is decremented by 1 and PC is loaded with LS. If LC is equal to zero, nothing special happens and the loop ends. Fig. 5.5 and Fig. 5.6 illustrate these loop end situations.
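The loop-end check described in section 5.2 can be sketched as a small behavioural model. This is only an illustration in Python, not the core's implementation: the function name `run_hardware_loop`, the `max_fetches` guard and the idea of returning a fetch trace are assumptions made for the example, and pipeline effects such as the LOOP delay slot are not modelled.

```python
def run_hardware_loop(ls, le, lc, start_pc, max_fetches=1000):
    """Hypothetical model of the LS/LE/LC check from section 5.2.

    Returns (trace, lc): the list of fetch addresses up to and including
    the final fetch at LE, and the remaining loop count.
    """
    trace = []
    pc = start_pc
    while len(trace) < max_fetches:
        trace.append(pc)
        if pc == le:
            if lc != 0:
                # LC is not zero: decrement it and reload PC from LS.
                lc -= 1
                pc = ls
                continue
            # LC is zero: nothing special happens and the loop ends.
            break
        pc += 1
    return trace, lc


# Loop body at addresses 12..13 with LC = 2 runs three rounds,
# i.e. the loop count is the number of rounds minus one.
trace, remaining = run_hardware_loop(ls=12, le=13, lc=2, start_pc=12)
print(trace, remaining)
```

A single-instruction loop is the case LS == LE; with LC = 1 the model fetches the same address twice before falling through.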
5. 6.2.11 Rounding (R)
If the rounding mode bit is set, the (2n+g)-bit ALU operations will round the result to n bits according to the selected rounding mode. The rounding mode is by default chosen by the parameter roundmode, or by an extension-specific mode bit (to be defined). By default there is just one rounding mode available in the particular implementation of the core. Rounding precedes saturation if both are enabled.
The possible rounding modes are truncate, normal round, convergent round to 0 and convergent round to 1. Rounding will clear the n bits of the lower half of the result and adjust the upper half according to the hard-wired rounding mode. In truncation, the upper half is left intact. In normal rounding, the uppermost bit of the lower half is added to the upper part, thus rounding up if the lower part was at least half of the LSB value of the upper part. In convergent rounding, the LSB of the upper half also has an effect on the rounding operation in case the lower half is exactly half of the LSB (0b1000...). In convergent to 0, the upper half LSB is added to the upper half in the special case, thus adding 1 if the LSB is one. In convergent to 1, the complement of the upper half LSB is added to the upper half in the special case, thus inserting 1 if the LSB is zero. The rounded result will be written in one of the accumulators (2n
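The four rounding modes can be compared side by side with a short Python sketch. It is a model under stated assumptions rather than the core's logic: the function name `round_accumulator` and the mode strings are invented for the example, and outside the exact-half special case the convergent modes are assumed to behave like normal rounding, which the text implies but does not state outright.

```python
def round_accumulator(value, n, mode):
    """Round `value` so that its lower n bits are cleared.

    mode is one of "truncate", "normal", "convergent0", "convergent1"
    (hypothetical names for the modes listed in section 6.2.11).
    """
    half = 1 << (n - 1)            # lower half exactly at 0b1000...
    lower = value & ((1 << n) - 1)
    upper = value >> n
    if mode == "truncate":
        increment = 0
    elif mode == "normal":
        # Add the uppermost bit of the lower half to the upper half.
        increment = 1 if lower >= half else 0
    elif mode in ("convergent0", "convergent1"):
        if lower == half:
            lsb = upper & 1
            # convergent0 adds the LSB itself, convergent1 its complement.
            increment = lsb if mode == "convergent0" else lsb ^ 1
        else:
            increment = 1 if lower > half else 0   # assumed: as normal
    else:
        raise ValueError("unknown rounding mode: " + mode)
    return (upper + increment) << n


# n = 4: upper half 1 (odd), lower half exactly one half (0b1000).
for mode in ("truncate", "normal", "convergent0", "convergent1"):
    print(mode, hex(round_accumulator(0x18, 4, mode)))
```

With an odd upper half the exact-half case rounds up under convergent to 0 and stays put under convergent to 1; with an even upper half the two swap roles.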
6. gt bl LDY i2 c0 z 2 gt b0 a21 gt GU LDY 12 cl1 PIIL gt c0 b21 gt cl z 2 z 1 result after this biquad to a reg 8 4 Single Precision Matrix Multiply C A x B matrix multiplication matrix dimensions A 5 4 B 4 3 C 5 3 Note to test with integers use mode 0x600 instead of 0x400 and store a0 or the whole a reg instead of al fract 15 Matrices define RA 5 define CA 4 define RB CA define CB 3 define RC RA define CC CB sect data_x X matrixA uword 1 2 3 4 uword 5 6 7 8 uword 9 1 2 3 uword 4 5 6 7 uword 8 9 1 2 sect data_y Y matrixB Rev 2 6 dimensions Data Data Page 72 March 8 2001 USER S MANUAL VS_DSP R SOFTWARE EXAMPLES uword 12 13 14 uword 15 16 17 uword 18 19 20 uword 21 22 23 matrixC zero 15 sect code Matrix_Multiply mult LDC 0x400 mr0 saturation amp fractional mode LDC matrixA i0 LDC 1 il LDC matrixB i2 LDC CB i3 LDC matrixC i4 LDC CA 1 c0 loop counter for one output value LDC RC do loop counter for rows nextrow LDC CC al loop counter for columns nextcolumn AND a NULL a LDX 10 b1 LDY 12 b0 out 0 gt a LOOP cO inloop MUL b0 b1 LDX 10 b1 LDY 12 b0 inloop MAC b0 b1l a LDX 10 b1 LDY 12 b0 out A i k B k j LDC CA 2 il modify addre
7. 2001 USER S MANUAL VS_DSP 7 INSTRUCTION CODING Table 7 4 ALU operand encoding 0000 AO radis y BO DO 0 0000 0000 reserved EA POSEE EE EEEF reserved CTECO D1 DO D2 Table 7 5 Mul operand 0 Gal Ol Gl Ol al Ol H Rev 2 6 Page 61 March 8 2001 USER S MANUAL VS_DSP 7 INSTRUCTION CODING Table 7 6 Mul mode Table 7 7 Single operand ALU instructions reserved Table 7 6 Mul mode gives the encoding of the mode field The result sub field encoding is shown in Table 7 3 Table 7 4 ALU operand gives the encoding of Op2 of the ALU field alu op2 The single operand opcode encoding is given in Table 7 7 Rev 2 6 Page 62 March 8 2001 USER S MANUAL VS_DSP 7 INSTRUCTION CODING 7 5 Move Encoding The move instructions are LDX LDY STX and STY the X and Y denoting the desired data bus to be used There can be a maximum of two moves loads or stores in parallel one operating on the X bus and the other on Y bus Constant loading is described separately in section 7 7 There are two kinds of moves full moves and short moves The short moves use a restricted set of registers and restricted addressing modes The full moves have all registers and all addressing modes available The parallel mov
8. L 0 RETI Delayed return from interrupt RETI Opl LR1 PC Opl IPRO Flags L 0 The RETI instruction is used for returns from interrupts similarly as JRcc is used for returns from subroutines For description of interrupt mechanism and the correct use of RETI see chapter 5 Coding 28 27 24 23 20 19 16 15 12 11 O010 0001 0dddjddddjddddjddddjddddidddd 28 27 24 23 20 19 16 15 12 11 0010 0001 1ddd ddd d eee es g ddd don t care bits nnn absolute address rrr Opl 10 I7 Rev 2 6 Page 52 March 8 2001 USER S MANUAL VS_DSP 6 INSTRUCTION SET REFERENCE E Store a register in X memory STX Opl Op2 Opl X Op2 update Op2 Flags x See LDX for the general load store capability description and the encoding of the move fields SEY Store a register in Y memory STY Opl Op2 Opl Y Op2 update Op2 Flags Y See LDX for the general load store capability description and the encoding of the move fields SUB Subtraction of two operands SUB Op1 Op2 An Opl Op2 gt A Flags Z N V E C The operand coding is shown in Table 7 4 ALU operand and the result coding in Table 7 3 Coding 31 28 27 24 23 20 19 17 16 0 0110 RRRR rrrr aaa parallel move RRRR Opl rrrr Op2 AAA target register SUBE Subtraction of two operands with carry SUBC Op1 Op2 An Opl Op2 C An Flags Z N V E C The operand coding is shown in Table 7 4 ALU operand and the resul
9. jump address. Fig. 4.2 shows the overall structure of the Instruction Address Generator. Connections from registers to data buses are not shown. The fetch address is determined as follows:
• On interrupt cycle 2, the interrupt vector (I0) is the fetch address.
• If the instruction in execute phase is Jcc and the condition is true, the jump address is the fetch address.
• If the instruction in execute phase is JRcc and the condition is true, LR0 contains the fetch address.
• If the instruction in execute phase is RETI, LR1 contains the fetch address.
• In all other cases PC holds the fetch address.
The Instruction Address Generator contains the optional loop hardware. Its behavior is further described in Chapter 5. To achieve a larger than pa-bit instruction address space, two page registers are used. IPR0 holds the uppermost part of the program address; IPR0 and PC together determine the program address.
4.1.3 Interrupt Control
The Interrupt Controller processes interrupts. It implements the interrupt state machine described in Fig. 5.7 (section 5.3). The Interrupt Controller receives the external interrupt and drives the interrupt fetch signal to the Instruction Address Generator. It makes sure that the previous interrupt has been processed before a new interrupt request is presented to the Instruction Address Generator.
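The fetch-address rules listed in this excerpt can be written out as a selection function. The Python below is a sketch only: the function name, the keyword arguments and the string tags for the executing instruction are all hypothetical, and treating the list order as a priority order is an assumption, since the excerpt does not say how simultaneous cases are resolved.

```python
def select_fetch_address(pc, lr0, lr1, *, executing=None, condition_true=False,
                         jump_target=None, interrupt_cycle_2=False,
                         interrupt_vector=None):
    """Pick the instruction fetch address (illustrative model).

    `executing` names the instruction in the execute phase:
    "Jcc", "JRcc", "RETI" or None for anything else.
    """
    if interrupt_cycle_2:
        return interrupt_vector          # interrupt vector is fetched
    if executing == "Jcc" and condition_true:
        return jump_target               # direct jump taken
    if executing == "JRcc" and condition_true:
        return lr0                       # indirect jump / subroutine return
    if executing == "RETI":
        return lr1                       # return from interrupt
    return pc                            # normal sequential fetch


# A JRcc whose condition holds fetches from LR0; otherwise PC is used.
print(hex(select_fetch_address(0x100, 0x40, 0x80,
                               executing="JRcc", condition_true=True)))
```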
10. controlled directly by the core control. This kind of instruction can also use a different set of data registers. The custom logic can be e.g. a bit manipulation unit or a divider.
• Memory Mapped Co-Processors: In addition to the tightly coupled custom logic above, the co-operating device can also be memory mapped. This kind of extension does not necessarily require a custom instruction; it can also be controlled simply by storing to and loading from the specific address. The co-processor can be almost anything, e.g. a hardware filter section, some iterative device, interface logic or a DMA coprocessor. The main issue in this kind of extension is that it does not need to, or cannot, be controlled by the core software anymore.
We strongly recommend contacting the core vendor before committing to the physical design of custom extensions.
1.1.4 Instruction Set
The basic instruction set is common to the different instantiations of the VS_DSP core, such that upward compatibility is retained when switching from a simpler version to a more versatile one. This includes all the data word lengths, optional hardware and modes, and instructions added by the parameters and also custom extensions. The assembly code can be directly reused, sometimes even the compiled binary code. However, the performance may only be improved by changing the critical parts of the code to use the more advanced features. In the other direction (downward), the stripped op
11. Codes oboaaaaaaaa E roootadaddade en T 0010axxxyyv RESP x op2y opl T 01nnnnnnnnnn LOOP loop end Isb loop end msb register loop count AAA 00nn Jcc n address msb address lsb condition 1001n CALLcc n address msb address lsb sme Sa ear te 101 address msb address lsb rk S MAG Tmn 11000000000 reserved 1111111111 for move encoding RESP is a special instruction to restore the P register The rest of the control codes are reserved for future extensions Rev 2 6 Page 59 March 8 2001 USER S MANUAL VS_DSP 7 INSTRUCTION CODING Table 7 3 ALU result coding n bit register 2n g bit register 7 4 Arithmetic Operands The operands of two operand arithmetic and logic instructions ADD SUB AND OR XOR are encoded in the second field of these instructions The field is composed as follows 27 24 23 20 19 17 In MAC 27 25 24 23 22 20 19 17 Table 7 4 ALU operand gives the encoding of Opl and Op2 of the ALU fields alu opl amp alu op2 S denotes sign extension Table 7 5 Mul operand gives the encoding of fields mac op and mac op2 The opcode of single operand arithmetic and logic instructions ABS LSR and MUL is encoded in the first operand field The encoding is 27 24 23 20 19 17 In MUL 27 25 24 23 22 20 19 17 Rev 2 6 Page 60 March 8
12. Loading of LR1 must override automatic LR1 load by interrupt mechanism to guarantee proper interrupt ending case when instruction 1 above is LR1 load 5 3 2 Interrupt Routines A typical interrupt jump table looks like the following org 0x20 Rev 2 6 Page 30 March 8 2001 USER S MANUAL VS_DSP 5 CONTROL FLOW Ra aoe 3 3 3 7475 T Fach noeh RO RI RD RS Decode wo RON FT Execute 2 RO RO RTC Pe RO RR Re Figure 5 10 Reset JMPI int_routine0 SP 1 JMP int_routinel SP 1 JMPI int_routine2 SP 1 Here the JMPI instructions also increase the stack pointer The start of the interrupt handler must save the processor state before enabling interrupts in the external arbiter The end of the handler restores the processor state Depending Whether only 16 bit or both 16 and 32 bit instruction memory addressing will be used in the program a different kind of a saving and restoring is used The following is a typical 16 bit interrupt routine InterruptService STX mrl 16 STY 17 16 1 STX lr1 i6 STY 1r0 16 1 STX 10 i6 STY il i6 actual interrupt functionality LDX 16 i0 LDY i6 1 i1 LDC NT_GLOB_EN i7 LDX 16 1r1 LDY 16 1 1r0 LDX 16 mr0 RETI STX 17 17 LDY 16 1 17 When an interrupt is taken the interrupt controller automatically disables a
13. Logical XOR of Op1 and Op2. Logical NOT of Op2 (one's complement).
Shift operations (each takes Op2 and produces Result): arithmetic 1-bit shift right; logical 1-bit shift right; logical 1-bit shift right with carry-in bit; logical 1-bit shift left; logical 1-bit shift left with carry-in bit.
Miscellaneous:
ABS Op2, Result: absolute value of Op2 (conditional negate).
RESP Op1, Op2: restore pipeline register from Op1 and Op2.
The ALU can calculate either (2n+g)-bit or n-bit operations. The selection of operation width is made depending on the operands: if one of the operands is 2n+g bits wide, the operation is 2n+g bits and the result is stored to a (2n+g)-bit register. If both operands are n bits, the operation and result are also n bits and the result is stored to an n-bit register.
The n-bit operands are A0, A1, B0, D0, D1. The pseudo registers NULL and ONES are also available and contain all zeros and all ones, respectively. NULL and ONES are considered to be n-bit registers for the purpose of determining the result width (see above).
The (2n+g)-bit operands are A, B, C and D. P is available as operand 2. The register A is formed by concatenating A2, A1, A0 (A0 is the lsb part). For (2n+g)-bit calculations n-bit registers are also available as operands. In this case the register is used as the middle part of the operand. The lsb end i
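The width-selection rule and the A2:A1:A0 concatenation can be illustrated with a few lines of Python. The values n = 16 and g = 8 are assumptions picked for the example (the manual treats both as core parameters), the function names are hypothetical, and P is left out because the excerpt only says it is available as operand 2.

```python
WIDE_REGISTERS = {"A", "B", "C", "D"}   # (2n+g)-bit operands named in the text


def operation_width(op1, op2, n=16, g=8):
    """Return the ALU operation width in bits for two operand names.

    Any operand not in WIDE_REGISTERS (including NULL and ONES) counts
    as an n-bit register, as the excerpt describes.
    """
    if op1 in WIDE_REGISTERS or op2 in WIDE_REGISTERS:
        return 2 * n + g
    return n


def concat_accumulator(a2, a1, a0, n=16, g=8):
    """Form the (2n+g)-bit register A from A2 (g bits), A1 and A0 (n bits)."""
    guard = a2 & ((1 << g) - 1)
    middle = a1 & ((1 << n) - 1)
    low = a0 & ((1 << n) - 1)
    return (guard << (2 * n)) | (middle << n) | low


print(operation_width("A0", "B0"))                     # both n-bit operands
print(operation_width("NULL", "C"))                    # one wide operand
print(hex(concat_accumulator(0x01, 0x2345, 0x6789)))
```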
14. OR logical OR A 2x Areg Ares K O E RETI retum from mem C Pc E El Ele STX store on X bus Ireg vidi isos ll Pak store on Y bus Ireg pe O UN SUB subtract Ja 2xAreg Areg u u x x x x x e R ol aid d a d logical XOR 2 x Areg Areg x x 0kx 0 Types A arithmetic logic C control M data move O optional Operands and result reg register I index mod modifier addr address Mode bits and flags x sets flag u uses bit 0 sets flag to 0 Delay slots db delay slot before da delay slot after Rev 2 6 Page 39 March 8 2001 USER S MANUAL VS_DSP 6 INSTRUCTION SET REFERENCE 6 4 Instruction Descriptions The instruction description includes the mnemonicandaone line description name of the command the syntax and mathematical expression of the instruction comments on the use and other specific information and finally the coding of the instruction The operand fields or other further refinements are given in accompanying tables The number of registers is dependent on the core parame ters Several instructions can be executed in parallel when they are using different fields of the instruction word e g ALU operations and two parallel moves with indirect addressing are possible see instruction composition in chapter 7 Rev 2 6 Page 40 March 8 2001 USER S MANUAL V
15. coding in Table 7 3 Coding 28 27 24 23 2019 1716 0 RRRR Opl rrrr Op2 AAA target register ASR Arithmetic shift right ASR Op2 An for each i gt 0 Op2 i gt Anli 1 Op2 msb A msb Flags Z N V E C op2 0 The instruction shifts right by one position The LSB bit is discarded and MSB of the source registers 1s fed into the MSB bit of the result 28 27 24 23 20 19 17 16 0 1111 0001 rrrzr aaal parallel move rrrr Op2 AAA target register Coding CALLcc Conditional delayed jump and save return address CALL addr PC LRO if cond addr PC Flags L 0 Identical to normal jump instruction but PC is saved to LRO This instruction replaces the sequence J addr LDC 1 LRO which is used in subroutine calls Note the one delay slot associated to this instruction The address which is saved to LRO is the CALL instruction address 2 The instruction in the delay slot is always executed regardless of the condition Coding 28 27 24 23 20 19 16 15 12 11 nn nn absolute address cccccc condition Rev 2 6 Page 42 March 8 2001 USER S MANUAL VS_DSP 6 INSTRUCTION SET REFERENCE HALT Halt the processor and wait for an interrupt HALT Flags no change The processor is halted to a low power state Normal execution is resumed when an interrupt occurs Coding 28 27 24 23 20 19 16 15 12 11 OO0O10J1LO1liddddjddddiddddjddddiddddjdddd dd don t care J Delayed jump to absol
16. d0 exp a gt exp b Rev 2 6 Page 74 March 8 2001 USER S MANUAL VS_DSP 8 SOFTWARE EXAMPLES S check the difference in exponents save loop hw status SUB d0 d1 d1 STX lc 16 1 STX ls 16 STY le 16 JGE 2 a is much bigger than b return a AND b0 NULL b0 zero lsp shift a b right 1 times to avoid overflow in add later loop shifts b 1 extra times shift b until it has the sam xponent LOOP d0 3 SUB a0 ONES dO make result have exp a 1 S3 ASR b b shift a 1 time restore loop hw AND a0 NULL a0 LDY i6 le zero lsp ASR a a LDX i6 1 1s a b now have the sam xp J f norm_res ADD ar bre LDX i6 1 lc do the add veturn a 2 J f norm ADD a NULL c a b gt c f_sub J f_add calculate a b SUB NULL b1 b1 negate bl Subroutines called by f_add f_sub and f_mul f_norm_res d0 exp cl c0O mantissa norm c gt c f_norm_res ADD c NULL c test mantissa for zero NOP Rev 2 6 Page 75 March 8 2001 USER S MANUAL VS_DSP JR AND SL exit S2 f_norm end Rev 2 6 8 SOFTWARE EXAMPLES JZC 1 result is not zero NOP c NULL cO force exp to zero ADD c1 c1 dl shift left for xor XOR else 04 NOP JNS 2 normalized exit ADD d0 ONES dO J 1 ADD CLeLe
17. • An MR0 load can set or clear the loop flag.
• JR, RETI, J, CALL and LOOP instructions clear the loop flag.
• JMPI does not affect the loop flag.
6.2.2 Index X (X)
The flag is set or cleared based on the address ALU output (the updated address). If address ALU X uses modulo addressing, the flag is set if the modulo restoring to the buffer is done (the calculated linear address is outside the buffer). If restoring is not done (the linear address is inside the buffer), the flag is cleared. If address ALU X uses other addressing modes (linear or bit reversal), the flag is set if the uppermost bit is set. If the bit is clear, the flag is cleared.
6.2.3 Index Y (Y)
Same as Index X above, but uses address ALU Y instead.
6.2.4 Zero (Z)
If the ALU is operating in the (2n+g)-bit mode and bits 2n+g-1 ... 0 of the ALU result are all clear, the flag is set. If the ALU is operating in the n-bit mode and bits n-1 ... 0 of the ALU result are all clear, the flag is set. Otherwise the flag is cleared.
6.2.5 Negative (N)
If the ALU is operating in the (2n+g)-bit mode and bit 2n+g-1 of the ALU result is set, the flag is set. If the ALU is operating in the n-bit mode and bit n-1 of the ALU result is set, the flag is set. Otherwise the flag is cleared.
6.2.6 Overflow (V)
Set if an arithmetic overflow occurs in the ALU result.
6.2.7 Extens
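The Zero and Negative rules depend only on the operation width, which makes them easy to check with a small model. The Python sketch below assumes n = 16 and g = 8 in its examples; the function name `zero_negative_flags` is hypothetical.

```python
def zero_negative_flags(result, width):
    """Return (Z, N) for an ALU result of the given width in bits.

    width is n for n-bit operations and 2n+g for wide operations.
    """
    result &= (1 << width) - 1            # keep bits width-1 ... 0 only
    zero = result == 0                    # all bits clear
    negative = bool((result >> (width - 1)) & 1)   # top bit set
    return zero, negative


# The same bit pattern is negative as a 16-bit result but not as a
# 40-bit result, because the flag looks at a different bit position.
print(zero_negative_flags(0x8000, 16))
print(zero_negative_flags(0x8000, 40))
```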
18. example is 8 fract 15 define BIQUADS 8 sect data_x XData aly delay Line uword 0x1100 0x120 input uword 0x1234 output zero 1 sect data_y YData z 2 s uword 0x11 0x22 0x33 0x44 0x55 0x66 0x77 0x88 BIQUADS coef coefficients al 0 0x13 dly_1 delay line z uword 0x111 0x222 uword 0x555 0x666 coef_1 coeffic 0x333 0x777 ients ty Dil dl uword 0x100 0x200 0x300 0x400 2 BIQUADS uword 0x500 0x600 0x700 0x800 00 0x1400 uword 0x1500 0x1600 0x1700 0x1800 11T 2 0x444 BIQUADS 0x888 azl BOTs ad dl uword 0x2100 0x2200 0x2300 0x2400 2 BIQUADS uword 0x2500 0x2600 0x2700 0x2800 uword 0x3100 0x3200 0x3300 0x3400 uword 0x3500 0x3600 0x3700 0x3800 sect code Biquad_ITR nas LDC 0x400 mr0 LDC input 10 AND a0 NULL a0 LDX 10 al input gt a LDC dly i0 LDC coef i2 LDC is Rev 2 6 Page 71 March 8 2001 USER S MANUAL VS_DSP DC OOP DE DX UL Klik Et EE MAC MAC MAC biquadloop ADD STX lirend end BIQUADS 1 1s ls biguadloop output i4 12 b0 LDY 10 b1 b0 b1 LDX 10 b0 b0 c0 a LDX i2 c0 cO bl a SIX bl 10 Cl bOy aso STY al apa al i4 store output r i0 1 z 1 8 SOFTWARE EXAMPLES all gt b0 z 1
19. full move 00000a 000 Ola nns ROBT MR1 E reserved IPR
Table 7.10: Load/Store coding (load)
Table 7.11: Addressing Modes (indirect In with post-modify by pppp; 7 7 rrr1000: indirect In with post-modification specified in In*)
Table 7.12: Modifications by the In* register (In In m: m positive; In: reserved; In In 2 m: optional; In In 2 m: optional; In In m: optional; In In m: optional; In In m bit reverse: optional; In In m: m negative, sign is automatically in the three MSB bits of In*). In the basic version only the ±m modifications are implemented.
7.7 Constant Loading
The additional fields in the constant load instruction LDC look like: 2l 65 0 immediate register. The immediates are assumed signed and will be sign extended if the register is wider than the immediate. In case there are more bits in the immediate than in the register to be loaded, the LSB part is taken. The register number is encoded as in the full addressing load/stores, shown in Table 7.9.
Table 7.13: Addressing mode summary (full move code Li
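The constant-loading rule (signed immediate, sign-extended into a wider register, LSB part kept for a narrower one) can be sketched in Python as follows. The widths are passed in as parameters because the excerpt's field layout is garbled; the 16-bit immediate used in the examples is an assumption, and `load_constant` is a hypothetical helper name.

```python
def load_constant(immediate, imm_bits, reg_bits):
    """Return the register contents after loading a signed immediate.

    The result is the raw bit pattern of the register (non-negative int).
    """
    imm = immediate & ((1 << imm_bits) - 1)
    reg_mask = (1 << reg_bits) - 1
    if reg_bits <= imm_bits:
        # More bits in the immediate than in the register: take the LSB part.
        return imm & reg_mask
    if imm & (1 << (imm_bits - 1)):
        # Sign bit set: extend the sign into the upper register bits.
        imm -= 1 << imm_bits
    return imm & reg_mask


print(hex(load_constant(0x8000, 16, 40)))   # negative immediate, wide register
print(hex(load_constant(0x12345, 20, 16)))  # immediate wider than register
```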
20. has a corresponding register pair, designated In*. In* is the index register the number of which is generated by inverting the LSB bit of the number of register In. For example, if I3 is In, then I2 is In*. To form X/Y addresses, In is used as the address. In* then specifies the post-modification (address mode), if any. To form long X addresses (2 x dataaddress), In and In* are concatenated to form the long address.
Two index registers can be read for X/Y addresses and X/Y Address ALU index inputs (designated In in Fig. 3.1). Two index registers can be read for X/Y Address ALU modifier inputs (designated In* in Fig. 3.1). In* is the register pair of In. The In registers used for X/Y Address ALU index inputs can be updated with address ALU outputs.
3.1.2 Address ALU
The Address ALU can calculate three types of updated addresses: linear post-inc/dec, modulo post-inc/dec and bit reversal. The Address ALU contains linear and bit-reverse adders for calculating linear and bit-reversed addresses. These adders are otherwise identical, but in the bit-reverse adder the carry propagates towards the LSB. The modulo logic is capable of restoring calculated linear addresses to remain within a buffer if modulo addressing is used. The buffer length does not need to be a power of two.
The modulo and bit-reverse addressing modes are separately enabled by the Addressing mode mask parameter of the core (see section 7.6). It is also possible to extend the addressing modes by a
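Two ideas from this excerpt lend themselves to a short illustration: the pair register is found by inverting the LSB of the register number, and a bit-reverse adder is an ordinary adder whose carry runs towards the LSB. The Python below is a sketch with hypothetical function names; the bit-reverse adder is modelled by the common trick of reversing both operands, adding normally and reversing the sum, which is equivalent to a reversed carry chain but is not claimed to be how the hardware is built.

```python
def pair_index(n):
    """Number of the pair register: the register number with its LSB inverted."""
    return n ^ 1


def bit_reverse(value, width):
    """Reverse the lowest `width` bits of value."""
    out = 0
    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out


def reverse_carry_add(a, b, width):
    """Add a and b with the carry propagating towards the LSB."""
    mask = (1 << width) - 1
    total = (bit_reverse(a, width) + bit_reverse(b, width)) & mask
    return bit_reverse(total, width)


print(pair_index(3), pair_index(2))      # I3 pairs with I2 and vice versa

# Stepping a 4-bit address by 0b1000 with reversed carry visits the
# addresses in bit-reversed order, as used for FFT reordering.
address, order = 0, []
for _ in range(8):
    order.append(address)
    address = reverse_carry_add(address, 0b1000, 4)
print(order)
```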
21. manuals describe the associated software development tools and development boards.
1.1.1 Functional Units
The basic VS_DSP architecture is shown in Fig. 1.1. The DSP core components are described in detail in Chapters 2-4. This section gives an overview of the blocks shown in the architecture diagram. [Figure 1.1: VS_DSP General Architecture]
The following units comprise the DSP core:
• Datapath: An arithmetic-logic unit (ALU) and a multiplier. Optionally a barrel shifter, a bit manipulation unit or other special computational units can be added to the core.
• Data Address Calculation: Two dedicated address calculation units provide addresses to data memory accesses. They enable two operands to be fetched from the data memory in parallel.
• Program Control: The program control fetches the instruction, generates the next program address and decodes the previously fetched instruction. The control may include optional hardware for zero-overhead loop control.
• Buses: The blocks are communicating over b
22. may be connected to interrupt lines.
• Interrupt Arbitrator: The core has a single interrupt line, but multiple interrupt priorities and interrupt nesting are supported by an external interrupt arbitration block. The hardware supports vectored interrupts.
• Clock Generator: The operating clocks of the core can be best provided by an on-chip phase-locked-loop based clock generator. The control registers of the clock generator can be memory mapped, enabling the core software to control the clock frequency at runtime. This is important especially in systems requiring low-power operation.
• Boot Loader: A specific piece of program ROM containing boot-up code for loading software from a host computer or an external non-volatile memory.
1.1.2 Parameters
The most distinguishing feature of the VS_DSP core is the use of parameters to tailor the actual implementation of the basic architecture. There are currently two basic implementations of the VS_DSP core: the full custom version (FC) and the synthesizable VHDL version (VHDL). See Table 1.1 for the parameters, their ranges and their values for the FC and VHDL core implementations.
The parameters can be used to optimize the performance, power consumption and both core and system chip area. Especially the data word length has a major impact on the system ASIC/ASSP area because of its direct relation to the area occupied by data memories. The parameter values are set in a hardware configurati
23. mod i for each i AW 4 0 mod i AW 3 ... The updated value for In is In = In + mod. Note that in the case of a negative modifier, In* should contain the desired modifier m in two's complement format.
Example: 0x61 - 0x1f = 0x42. In = 0x61 (97); updated In = 0x42 (66).
3.2.2 Modulo Post-increment/decrement (Optional)
In modulo addressing, calculated addresses are kept within a buffer whose length is M. The lower boundary of the buffer must be an integer multiple of 2^k, where 2^k > M. To use modulo addressing, In must be within the buffer, i.e. the AW-k MSB bits of In must equal the corresponding bits of the lower boundary. The AW-3 LSB bits of In* should contain the value M-1. AW means the data address width.
To calculate the updated address, the following steps are taken:
1. Determine the lower boundary of the buffer. Starting from In*[AW-3], find the uppermost 1-bit in In*. Let k be the bit position of the uppermost 1-bit. The lower boundary is: for each i in AW-1 ... k+1, lower[i] = In[i]; for each i in k ... 0, lower[i] = 0.
2. Determine the upper boundary of the buffer. The upper boundary is the lower boundary plus the length of the buffer, which is contained in In*. The upper 3 MSBs of In* are not part of the buffer length: length = In*[AW-3:0] + 1; upper = lower + length - 1.
3. Calculate linear address. Calcul
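Steps 1 and 2 of the modulo calculation can be sketched in Python as below. Several things here are assumptions rather than statements from the manual: the excerpt breaks off at step 3, so the wrap-around rule (add or subtract the buffer length when the linear address leaves the buffer) is a guess at the usual behaviour; the step is passed in as a plain signed integer with magnitude no larger than the buffer length; and `length_minus_1` stands for the M-1 field of the pair register with its three mode bits already stripped. The function name is hypothetical.

```python
def modulo_update(index, step, length_minus_1, aw=16):
    """Return the updated index register for modulo addressing (model).

    index          -- current value of In (must lie inside the buffer)
    step           -- signed post-increment/decrement (assumed parameter)
    length_minus_1 -- M-1, taken from the low bits of the pair register
    """
    addr_mask = (1 << aw) - 1
    # Step 1: k is the position of the uppermost 1-bit of M-1;
    # bits k ... 0 of the lower boundary are zero, the rest come from In.
    k = length_minus_1.bit_length() - 1
    low_mask = (1 << (k + 1)) - 1
    lower = index & ~low_mask & addr_mask
    # Step 2: upper boundary from the buffer length.
    length = length_minus_1 + 1
    upper = lower + length - 1
    # Step 3 onwards (assumed): linear address, restored into the buffer.
    linear = index + step
    if linear > upper:
        linear -= length
    elif linear < lower:
        linear += length
    return linear & addr_mask


# Buffer of M = 5 words starting at 0x100 (a multiple of 8).
print(hex(modulo_update(0x103, +1, 4)))   # stays inside the buffer
print(hex(modulo_update(0x104, +1, 4)))   # wraps to the lower boundary
print(hex(modulo_update(0x100, -1, 4)))   # wraps to the upper boundary
```

The example also shows why the buffer length need not be a power of two: only the alignment of the lower boundary is tied to a power of two.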
24. shift left first adjust c0 by 1 JR SUB d0 ONES cO adjust back ADD c0 NULL do J f norm_res AND c0 NULL CU Page 76 March 8 2001
25. the normal execution flow the instruction immediately before the jump has been executed The jump condition is determined again this time with different flags Rev 2 6 Page 56 March 8 2001 USER S MANUAL VS_DSP Chapter 7 Instruction Coding 7 1 General Instruction Composition The instruction is composed of a 4 bit opcode and additional fields as described below 31 28 27 65 0 O0oOO iiiiiiiiiiiiiiiijiiiiji YYYYYY opcode immediate target 31 28 27 0 OODOO CCCCCCCCcCCCCCCCCCCCCCCCCCCCcCCCC opcode control code 31 28 27 14 13 0 OO00 XXXXXXXXXXXXXX YYVYVYVYVYYYYVYYVYVYYV opcode X full move Y full move 31 28 27 17 16 0 ooooaaaaaaa a aa mmmmmmmmmmmmmmmm opcode arithmetic operands parallel moves 7 2 Opcode Field The encoding of operations is shown in Table 7 1 The control and double move exten sions to the opcode are described in the following section Rev 2 6 Page 57 March 8 2001 USER S MANUAL VS_DSP 7 INSTRUCTION CODING Table 7 1 Operation Codes Parallel A DC 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 Single op instructions 7 3 Control Code The absolute address in jump instructions is at most 20 bits The conditional jumps Jcc are taken when the condition given in the instruction is true See Table 6 1 Jump condition for the condition field coding The flag an
26. [Figure 5.6: Loop end when LC 0]
5.3 Interrupts
Interrupts are vectored using a jump table. The external interrupt arbiter supplies an interrupt vector. The vector is an address in the range 0x20-0x3f. These addresses hold a jump table with JMPI instructions which jump to the start of the appropriate interrupt routine.
In interrupts, LR1 is used to save the return address. When the main program is interrupted, the return address is automatically copied to LR1. Interrupts end with a RETI or a JRcc. If nested interrupts are needed, the interrupt program must save LR1 and restore it before returning from the interrupt. The saving and restoring instructions must be the routines specified in section 5.3.2.
5.3.1 Interrupt Mechanism
Fig. 5.7 shows the interrupt state machine with 5 interrupt cycles. State transitions are done at the end of each instruction cycle. The actions in each state are described on the right-hand side of that state.
On interrupt cycle 1, the fetch address is copied to LR1. Instruction fetch is done from the normal fetch address.
On interrupt cycle 2, the first interrupt instruction (I0) is fetched. IPR0 is copied to IPR1. On this cycle it is also decided whether the instruction fetched on interrupt cycle 1 will be canceled or not. The L flag is set in the MR0 register. The instruction fetched on interrupt cycle 1 must be canceled unless:
• the instruction fetched on the cycle before interrupt c
27. [coding diagram not recoverable] ddd = don't care bits, cccccc = condition, rrr = Op (I0-I7)

LDC: Load constant to a register
LDC constant, Op1
constant -> Op1
Flags: no change

The register Op1 coding is shown in Table 7.9 (Target, full move). The assembler understands numbers in different bases (e.g. hexadecimal, decimal, binary), while the immediate is finally coded in binary format. A single constant load can be done in an instruction, and no parallel arithmetic can be used. The constant is LSB-aligned and sign-extended if needed.
Coding: [bit diagram not recoverable] RRRRRR = Op1, ii...ii = constant (immediate)

LDX: Load register from X memory
LDX Op1, Op2
X memory at Op1 -> Op2, update Op1
Flags: X

LDY: Load register from Y memory
LDY Op1, Op2
Y memory at Op1 -> Op2, update Op1
Flags: Y

Coding (double full moves): bits 31-28, 27-14, 13-0
Coding (parallel full move): oooo = opcode allowing parallel moves, dddd = don't care, b = bus (X = 0, Y = 1), FFFFF = full move bits of X/Y
Coding (parallel short moves): xxxx = short move bits of X, yyyy = short move bits of Y

LDX: Load register from X memory with long address
LDX Op2, Op3, Op1
X memory at the long address formed by Op2 and Op3 -> Op1
Flags: X

STX: Store register in X memory with long address
28. ALU result in arithmetic operations is saturated to the maximum positive or negative value in case the operation creates an overflowing or underflowing result. The integer/fractional mode controls the shifter after the multiplier to output the result in a correctly aligned format. The rounding mode is implementation dependent. Arithmetic flags are evaluated after an arithmetic operation.

2.4 ALU

The functions of the ALU and the multiplier are listed below.

Multiplying, adding and subtracting (operands Op1, Op2, Result):
ADD: Add operands
ADDC: Add operands with carry-in bit
SUB: Subtract operands (order can be chosen)
SUBC: Subtract operands with borrow-in (order can be chosen)
MUL: Multiply operands (multiplier)
MAC: MAC operation (ALU and multiplier)
MSU: MSU operation (ALU and multiplier)

Special operations with add/sub:
NULL, Op2, Result: Negate Op2 (two's complement)
Op1, ONES, Result: Increment Op1 (ONES subtracted)
Op1, ONES, Result: Decrement Op1 (ONES added)
Op1, NULL, Result: Pass Op1 unchanged
NULL, NULL, Result: Clear result register to zero
NULL, ONES, Result: Set result register to 1 or to -1, depending on whether ONES is subtracted or added

Logical operations:
AND Op1, Op2, Result: Logical AND of Op1 and Op2
OR Op1, Op2, Result: Logical OR of Op1 and Op2
XOR Op1, Op2, Result
NOT Op2, Result
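The saturation behaviour described for the ALU can be illustrated with a small Python model. This is only a sketch under stated assumptions: the function name `saturating_add`, the default 16-bit width, and the wrap-around result used when saturation is off are illustrative choices, not taken from the manual.

```python
def saturating_add(a: int, b: int, bits: int = 16, saturate: bool = True):
    """Add two signed `bits`-wide values and return (result, overflow_flag).

    Hypothetical model: with saturation on, an out-of-range sum is clamped
    to the largest positive or negative value; with saturation off, it is
    assumed to wrap around in two's complement.
    """
    lo = -(1 << (bits - 1))          # most negative representable value
    hi = (1 << (bits - 1)) - 1       # most positive representable value
    total = a + b
    if lo <= total <= hi:
        return total, False          # fits, no overflow
    if saturate:
        return (hi if total > hi else lo), True
    wrapped = (total - lo) % (1 << bits) + lo
    return wrapped, True
```

With 16-bit operands, `saturating_add(0x7000, 0x2000)` clamps to `0x7fff`, while the same call with `saturate=False` wraps to a negative value.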
29. Architecture Overview

The program control unit (pcu) performs instruction fetch and decode, control flow changes and interrupt fetching. In addition to the program counter (PC), the program control unit has two link registers which are used for indirect jumps (LR0 and LR1).

[Figure 4.1: Program Control overview]

[Figure 4.2: Instruction Address Generator overview]

Mode register MR0 holds the mode and flag bits, and MR1 is used as a temporary mode register while transferring to interrupt service. The optional loop control has three registers: LS, LE and LC. The program counter is not directly accessible.

The Program Control unit has three components, which are shown in Fig. 4.1. The components (Instruction Decode, Instruction Address Generator and Interrupt Controller) are described in the following subsections.

4.1.1 Instruction Decode

Instruction Decode reads instructions from the Instruction Data Bus and decodes them.

4.1.2 Instruction Address Generator

The Instruction Address Generator contains all pcu registers. The Instruction Address Generator drives the Instruction Address Bus from PC, LR0, LR1, the interrupt address or from instruction
30. L: Logical shift left
LSL Op2, An
for each i < bits-1: Op2[i] -> An[i+1]; 0 -> An[0]
Flags: Z, N, V, E, C = Op2[bits-1]

The instruction shifts left by one position. This instruction is implemented in hardware as ADD Op2, Op2, An. Note: P is not available as an operand for this instruction.
Coding: bits 31-28 = 0100, bits 16-0 = parallel move; rrrr = Op2, AAA = target register

LSLC: Logical shift left with carry
LSLC Op2, An
for each i < bits-1: Op2[i] -> An[i+1]; C -> An[0]
Flags: Z, N, V, E, C = Op2[bits-1]

The instruction shifts left by one position. This instruction is implemented in hardware as ADDC Op2, Op2, An. Note: P is not available as an operand for this instruction.
Coding: bits 31-28 = 1000, bits 16-0 = parallel move; rrrr = Op2, AAA = target register

LSR: Logical shift right
LSR Op2, An
for each i > 0: Op2[i] -> An[i-1]; 0 -> An[msb]
Flags: Z, N, V, E, C = Op2[0]

The instruction shifts right by one position. The LSB bit is discarded and zero is fed into the MSB bit. The operand Op2 is encoded as described in Table 7.4 (ALU operand) and the result coding in Table 7.3.
Coding: bits 31-28 = 1111, bits 27-24 = 0010, bits 16-0 = parallel move; rrrr = Op2, AAA = target register

This instruction is implemented as a single instruction software macro.

LSRC: Logical s
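The one-position shifts above can be modelled in a few lines of Python. This is a minimal sketch, assuming a 16-bit data word; the function names and the `carry_in` parameter are illustrative, and only the C flag is modelled (Z, N, V and E are ignored).

```python
def shift_left(op2: int, carry_in: int = 0, bits: int = 16):
    """Model of LSL (carry_in = 0) and LSLC (carry_in = C).

    Returns (result, carry_out), where carry_out is the bit shifted out
    of the MSB, matching C = Op2[bits-1].
    """
    mask = (1 << bits) - 1
    op2 &= mask
    carry_out = (op2 >> (bits - 1)) & 1
    result = ((op2 << 1) | (carry_in & 1)) & mask
    return result, carry_out


def shift_right(op2: int, carry_in: int = 0, bits: int = 16):
    """Model of LSR (carry_in = 0) and the carry-in variant.

    Returns (result, carry_out), where carry_out is the discarded LSB,
    matching C = Op2[0].
    """
    mask = (1 << bits) - 1
    op2 &= mask
    carry_out = op2 & 1
    result = (op2 >> 1) | ((carry_in & 1) << (bits - 1))
    return result, carry_out


def shift_left_double(high: int, low: int, bits: int = 16):
    """Shift a two-word value left by one: LSL on the low word,
    then the carry-in form on the high word."""
    new_low, carry = shift_left(low, 0, bits)
    new_high, _ = shift_left(high, carry, bits)
    return new_high, new_low
```

Chaining the carry from the low word into the high word is what makes the carry variants useful for multi-word shifts; a plain left shift also gives the same result as adding the operand to itself, which matches the note that LSL is implemented as an ADD.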
31. In this sense, LOOP is also considered a change of flow instruction, in addition to J, Jcc, JRcc, CALLcc and RETI. The JMPI instruction is also a change of flow instruction and has the same kind of timing behavior as other change of flow instructions, but the instruction in the delay slot is canceled (executed as NOP) and can therefore be a change of flow instruction. This feature is mostly used in the interrupt vector table.

5.1 Jumps

Jump conditions are the processor flags and their combinations. The flags that are used in the jump condition evaluation must be unaffected in the cycle before the jump instruction is executed, i.e. the instruction immediately before the jump instruction must not change the jump condition flags. Other flags can be modified.

Fig. 5.1 shows the situation where instruction 2 is a change of flow instruction (J, Jcc, JRcc, CALLcc or RETI). Instruction 3 is in the delay slot and is always executed. When the jump instruction executes (cycle 4), the program address is driven either from the jump target register LR0 or LR1 (jump is taken) or from PC (jump is not taken). D2 denotes this address. PC is loaded with D2 + 1 on the next cycle.

[Figure 5.1: Jump execution]

[Figure 5.2: Loop start]
32. L Op1, Op2
Op1 x Op2 -> P
Flags: no change

Performs one multiplication. The operands can be signed or unsigned; multiplication mode and possible saturation are controlled by the appropriate mode bits. There are different mnemonics for different format operands. The data format can be Op1 signed, Op2 signed (MULSS); Op1 unsigned, Op2 signed (MULUS); Op1 signed, Op2 unsigned (MULSU); or Op1 unsigned, Op2 unsigned (MULUU). The format SS is the default, and MULSS can thus be written as plain MUL.
Coding: bits 31-28 = 1111, bits 16-0 = parallel move; rrr = Op1, RRR = Op2, mm = data format

MVX, MVY: Register to register move
MVX Op1, Op2
Op1 -> Op2
Flags: no change

Moves a register to another register using the X or Y data bus. In parallel MVX, any register can be used as a source or target. The source is read on the X bus, switched to the Y bus and written from the Y bus. In double MVX/MVY, two moves can be performed with a single instruction. The source and destination registers must be from different execution units (ALU, DAG, PCU).
Coding (parallel move): [bit diagram not recoverable]
Coding (double move): bits 31-28 = 0010, bits 27-24 = 1011, bits 23-18 = SSSSSS, bits 17-12 = DDDDDD, bits 11-6 = ssssss, bits 5-0 = dddddd; ssssss = Y source, dddddd = Y target, SSSSSS = X source, DDDDDD = X target

NOP: No operation
NOP
no effect
Flags: no change

A
33. 4.2.5 MR1
4.2.6 LS (optional)
4.2.7 IPR0
4.2.8 IPR1
4.2.9 LE (optional)
4.2.10 LC (optional)
5 Control Flow
5.1 Jumps
5.2 Loops (Optional)
5.3 Interrupts
5.3.1 Interrupt Mechanism
5.3.2 Interrupt Routines
5.4 System Reset
6 Instruction Set Reference
6.1 Programming Model
6.2 Flags and Mode Bits
6.2.1 Loop (L)
6.2.2 Index X (X)
6.2.3 Index Y (Y)
6.2.4 Zero (Z)
6.2.5 Negative (N)
6.2.6 Overflow (V)
6.2.7 Extension (E)
6.2.8 Carry (C)
6.2.9 Saturation (S)
6.2.10 Integer (I)
6.2.11 Rounding (R)
6.3 List of Instructions
6.4 Instruction Descriptions
6.5 Instruction Sequence Restrictions
6.5.1 Loop Register Restrictions
6.5.2 Conditional Jump Restrictions
7 Instruction Coding
7.1 Gener
34. ABS: Absolute value
ABS Op2, An
|Op2| -> An
Flags: Z, N, V, E, C

The operand is conditionally negated (two's complement operation) and placed in the target register. The coding of Op2 is given in Table 7.4 (ALU operand) and the result coding in Table 7.3. The absolute value of the minimum integer (fraction -1.0) is the maximum integer in the saturation mode.
Coding: bits 31-28 = 1111, bits 27-24 = 0000, bits 23-20 = rrrr, bits 19-17 = AAA, bits 16-0 = parallel move; rrrr = Op2, AAA = target register

ADD: Addition of two operands
ADD Op1, Op2, An
Op1 + Op2 -> An
Flags: Z, N, V, E, C

The operand coding is shown in Table 7.4 (ALU operand) and the result coding in Table 7.3. LSL is constructed with ADD Op1, Op1, An.
Coding: bits 31-28 = 0100, bits 27-24 = RRRR, bits 23-20 = rrrr, bits 19-17 = AAA, bits 16-0 = parallel move; RRRR = Op1, rrrr = Op2, AAA = target register

ADDC: Addition of two operands with carry
ADDC Op1, Op2, An
Op1 + Op2 + C -> An
Flags: Z, N, V, E, C

The operand coding is shown in Table 7.4 (ALU operand) and the result coding in Table 7.3.
Coding: bits 31-28 = 1000, bits 27-24 = RRRR, bits 23-20 = rrrr, bits 19-17 = AAA, bits 16-0 = parallel move; RRRR = Op1, rrrr = Op2, AAA = target register

AND: Bitwise AND of two operands
AND Op1, Op2, An
for each i: Op1[i] & Op2[i] -> An[i]
Flags: Z, N, V = 0, E, C = 0

The operand coding is found in Table 7.4 (ALU operand) and the result
35. VS_DSP USER'S MANUAL
Revision 2.6, March 8, 2001

Revision history
Rev. 2.6, March 8, 2001: Added long X decoding.
Rev. 2.5, January 5, 2001: Core parameter table and explanation updated.
Rev. 2.4, October 16, 2000: Minor corrections to the instruction coding chapter.
Rev. 2.3, October 10, 2000: Minor corrections to L flag references.
Rev. 2.2, May 25, 2000: L flag added.
Rev. 2.1, October 4, 1999: Instruction coding corrected.
Rev. 2.0, July 9, 1999: Revised for VS_DSP2.
Rev. 1.2, December 8, 1998: Typos corrected.
Rev. 1.1, September 7, 1998: ASHL, LSHL removed.
Rev. 1.0, April 27, 1998: First release.

1998-2000 VLSI Solution Oy, Hermiankatu 6-8 C, FIN-33720 Tampere, Finland

Information furnished by VLSI Solution Oy is believed to be accurate and reliable. However, no responsibility is assumed by VLSI Solution Oy for its use. Specifications are subject to change without notice. All rights reserved. No part of this manual may be reproduced in any form or by any means without permission in writing from the copyright owner. The descriptions contained herein do not imply the granting of license to make, use or sell equipment constructed in accordance therewith.

Contents
1 Introduction
1.1 Overview
1.1.1 Functional Units
[further entries not recoverable] Extensions
36. VX or STX instruction. IPR0 can be changed by a JRcc or JMPI instruction.

4.2.8 IPR1

IPR1 is the interrupt register of IPR0. In interrupts, IPR0 is copied to IPR1 at interrupt cycle 2. There are limitations on the use of IPR1. It can be accessed only as a source operand in an MVX or STX instruction. There is no way to write to IPR1 except the interrupt mechanism.

4.2.9 LE (optional)

LE holds the loop end address. The LOOP instruction loads LE with the loop end address specified in the LOOP instruction. When an instruction fetch occurs from the LE address and the L flag is not set, a new loop round starts if LC != 0. LE is initialized with all ones in system reset.

4.2.10 LC (optional)

LC holds the loop count. The LOOP instruction loads LC from the specified register. When an instruction fetch occurs from the LE address, LC is tested for being equal to 0. If LC != 0, it is decremented by one, a new loop round starts and LS is copied to PC. If LC = 0, nothing special happens and PC is loaded with the instruction fetch address + 1 as usual.

Chapter 5 Control Flow

The control flow behavior follows the three-stage pipelining of the processor operation. The change of flow instructions are all delayed, with one delay slot following the instruction. There cannot be another change of flow instruction in the delay slot.
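The interplay of LS, LE and LC described in sections 4.2.9 and 4.2.10 can be sketched as a simple address-selection function. This is a behavioural model only: pipeline timing and delay slots are ignored, and the function names are hypothetical.

```python
def next_fetch_address(fetch_addr: int, ls: int, le: int, lc: int,
                       l_flag: bool = False):
    """Return (next_pc, new_lc) after a fetch from `fetch_addr`.

    When the fetch hits the loop end address LE, the L flag is clear and
    LC != 0, LC is decremented and the loop start LS becomes the next PC.
    Otherwise the PC simply advances by one.
    """
    if fetch_addr == le and not l_flag and lc != 0:
        return ls, lc - 1
    return fetch_addr + 1, lc


def count_body_passes(ls: int, le: int, lc: int) -> int:
    """Count how many times the loop body LS..LE is executed."""
    if ls > le:
        raise ValueError("loop start must not be after loop end")
    pc, passes = ls, 0
    while True:
        at_end = pc == le
        pc, lc = next_fetch_address(pc, ls, le, lc)
        if at_end:
            passes += 1
            if pc != ls:        # fell through past LE: loop finished
                return passes
```

In this model a loop count of 15 gives 16 passes through the body, which agrees with the manual's FIR example where the loop count is described as the number of loops minus one.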
37. al Instruction Composition
7.2 Opcode Field
7.3 Control Code
7.4 Arithmetic Operands
7.5 Move Encoding
7.6 Addressing Modes
7.7 Constant Loading
8 Software Examples
8.1 Single Precision FIR Transversal Filter
8.2 Double Precision FIR Transversal Filter
8.3 Cascaded Biquad IIR Filter
8.4 Single Precision Matrix Multiply
8.5 Floating Point Multiplication and Addition

Chapter 1 Introduction

1.1 Overview

VS_DSP is a parameterized and extensible DSP core. The different manifestations of the core share a common general architecture and instruction set. The core can be used to build application-specific integrated circuits (ASICs) and standard products (ASSPs). The core is available in several CMOS fabrication processes and can be promptly ported to any normal CMOS process line.

This manual provides an introduction to the general architecture, parameters and extension attachment. The instruction set reference is also included. Separate
38. ate linear modified address depending on MSBs of In+1

4. Restore to buffer if needed. If the linear address is outside the buffer, restore it to the buffer. If the linear address is already inside the buffer, nothing needs to be done.
If lower <= lin <= upper: In = lin
If lin < lower: In = lin + length
If lin > upper: In = lin - length

Example: 13-point ring buffer 0x20-0x2c
In = 0x2c (44), In+1 = 0x4c (64 + 12 = 76)
1. lower = 0x20
2. upper = 0x2c
3. linear = 0x2e
4. updated In = 0x21 (33)

3.2.3 Bit Reversal (Optional)

In bit-reversal addressing, calculated addresses are kept within a buffer of length 2^k, and when calculating the updated address, the carry is propagated towards the LSB. The lower boundary of the buffer must be a multiple of 2^k. To use bit-reversal addressing, In must be within the buffer, i.e. the AW - k MSB bits of In must equal the corresponding bits of the lower boundary. The 3 MSBs of In+1 should contain 110 to select bit-reversal addressing. The LSBs of In+1 should contain the value 2^(k-1). In is updated by adding the LSBs of In+1 to In, with the carry propagated towards the LSB.

Example: 16-point (k = 4) FFT in buffer 0x50-0x5f
In = 0x5c (92), In+1 = 0xc8 (192 + 8 = 200); the updated In value is missing from this extract.

Chapter 4 Program control

4.1
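The ring-buffer restore step and the bit-reversal update from section 3.2 can be tried out with the following Python sketch. It is a behavioural model under assumptions: the function names are hypothetical, the encoding of the modifier register In+1 is not modelled, and the bit-reversal step is written as "increment the bit-reversed offset", which is equivalent to adding 2^(k-1) with the carry running towards the LSB.

```python
def restore_to_buffer(lin: int, lower: int, upper: int) -> int:
    """Step 4 of modulo addressing: bring a linear address back
    into the ring buffer lower..upper (both inclusive)."""
    length = upper - lower + 1
    if lin < lower:
        return lin + length
    if lin > upper:
        return lin - length
    return lin


def reverse_bits(value: int, k: int) -> int:
    """Reverse the k low-order bits of `value`."""
    return int(format(value, f"0{k}b")[::-1], 2)


def bit_reversed_step(addr: int, k: int) -> int:
    """Advance `addr` by one bit-reversed step inside its 2**k buffer."""
    mask = (1 << k) - 1
    base = addr & ~mask                      # buffer lower boundary
    offset = reverse_bits(addr & mask, k)    # position in natural order
    offset = (offset + 1) & mask             # ordinary increment
    return base | reverse_bits(offset, k)
```

For the 13-point example, `restore_to_buffer(0x2e, 0x20, 0x2c)` gives `0x21`, matching the manual. Stepping repeatedly with `bit_reversed_step` from the buffer base visits the offsets in the familiar FFT order 0, 8, 4, 12, 2, and so on.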
39. d mode bits can be masked by the implementation parameter Modemask (see Chapter 4).

Return (JRcc) and return from interrupt (RETI) use the link registers to restore the PC. The linking (return address storage) is done by a constant load instruction to the link register LR0; the link register should be saved beforehand if a subroutine is already being executed. The return address is calculated at compilation/linking time, not at run time. This also allows jumps by loading the link register and then executing the JRcc instruction. The linking can also be done in the delay slot. The LR1 loading takes place automatically when interrupt processing is started.

In the optional loop instruction, there is a register number containing the loop count. All registers except the double-size accumulators can be used. The loop end address is given as an immediate (at most 20 bits) value. The loop start address will be loaded automatically from the PC. The register field encoding is given in Table 6.2 (Loop count).

The loop registers (LC, LS, LE) should not be loaded within the two instructions preceding a loop end, to avoid implementation-dependent ambiguities in the loop behavior.

In the full-size moves, the load/store operations can use all the addressing modes and all registers. These moves do not allow any control operations in parallel. See section 7.5.

Table 7.2: Control
40. d to LR1 on interrupt cycle 1 and possibly on interrupt cycle 2 (see section 5.3.1 for a description of the interrupt mechanism). If nested interrupts are needed, LR1 must be saved and restored by the interrupt service routine. See section 5.3.2 for the save and restore routines.

4.2.4 MR0

MR0 is the processor mode/status register. The length of the register is pa, so the minimum length is 11 and the maximum 20 bits. It includes the mode bits and status flags. The bits are shown here for the case pa = 16:

bits 15-8: mode bits | bits 7-0: flags
S = saturation mode, I = integer/fractional mult. mode

At the end of an interrupt, MR0 is restored from the stack. Thus explicit moves must override the evaluation of flags. The mode bits and flags are described in more detail in section 6.2.

4.2.5 MR1

MR1 is the interrupt register of MR0. In interrupts, MR0 is copied to MR1 at interrupt cycle 4, when the L flag is set in MR0. MR1 must be saved at the start of the interrupt.

4.2.6 LS (optional)

LS holds the loop start address. The LOOP instruction copies the instruction fetch address to LS. When a new loop round starts, PC is loaded with LS instead of the instruction fetch address + 1.

4.2.7 IPR0

IPR0 is the instruction page register. It holds the upper pa bits of the instruction address. There are limitations on the use of IPR0. It can be accessed only as a source operand in M
41. ed as an ALU operand in ADD and SUB instructions to form a MAC or MSU operation. The data word length n is a parameter, and the multiplier word length m is another, independent parameter. Two data buses (width n) connect the datapath to off-core memories.

[Figure 2.1: VS_DSP datapath]

2.2 Arithmetic

The datapath operates by default on signed (two's complement) numbers. The multiplier has separate modes for integer and fractional multiply, selected by a bit in the mode register (see Chapter 4). The multiplier can also operate on unsigned-signed, signed-unsigned and unsigned-unsigned operands. The type of the operands is declared in the multiply instruction, with the signed-signed operand pair as the default. The logical operations simply consider the operands as bit patterns.

There is also a saturation mode for the multiplier and ALU, selected by a bit in the mode register (see Chapter 4). In the saturation mode, the result is interpreted as a signed number and saturated accordingly.

2.3 Flags and Mode Bits

The processor mode/status register includes the mode bits and status flags. The bits affecting or being affected by the datapath are:

[bit diagram not recoverable; surviving legend: R = rounding mode]

In the saturation mode the
42. egister for P, because a single multiplication result always fits into a 2n-bit register.

The arithmetic registers can be used either as the n-bit registers mentioned above or as (2n+g)-bit registers (A, B, C, D, P).

Address registers are the da-bit index registers I0, I1, ..., I7. Optionally there may also be index registers starting from I8.

Control registers are the program counter PC, link registers LR0, LR1 and mode registers MR0, MR1. Optional loop hardware registers are LS, LE, LC, and the page registers are IPR0, IPR1.

6.2 Flags and Mode Bits

The mode/status register MR0 bits are shown here for the case pa = 16:

bits 15-8: mode bits | bits 7-0: flags
S = saturation mode, I = integer/fractional mult. mode

The normal definition of the flags and mode bits is as follows. Exceptions to the flag behavior are listed in the particular instruction description.

6.2.1 Loop (L)

When set, the loop flag disables loop end detection, i.e. the loop hardware. The flag is automatically set by the interrupt mechanism to prevent false loop end detections when the interrupt causes the execution to transfer to the zero page from another page. Normally there is no need for the user to set or clear the loop flag.

The detailed operation of the loop flag is as follows:
- Interrupt sets the loop flag. The value in MR corresponds to L = 1
43. erations have to be compensated by replacing the extension instructions with corresponding software macros. Special care also has to be taken when decreasing the width of some address or data word length. On the C language level, compatibility is not an issue, since the tools can retarget the code to the correct core version by recompilation.

VS_DSP has a reasonable level of parallelism. The operation is pipelined in three stages: fetch, decode, execute. Within a single processor clock cycle the core can:
- Generate the next program address
- Fetch an instruction
- Decode the previously fetched instruction
- Perform up to two data moves
- Post-modify up to two data pointers
- Perform a computation on register data

Chapter 6 details the instruction set.

1.2 VS_DSP Development System

VS_DSP is supported by a comprehensive set of software and hardware for core evaluation and application system development. The VS_DSP Evaluation Kit consists of the VS_DSP Software Development Toolkit (VSKIT) and the Development Board.

VSKIT includes:
- Assembler. The Assembler assembles the source code and data modules and enables e.g. macros and include files to be used. The Assembler adapts to the parameter values given in Configuration Files.
- Linker. The Linker links separately assembled modules.
- Archiver. The Archiver enables a function library to be b
44. es can be done together with arithmetic operations and can either be one full or two short moves. The double full move instruction has two full moves but cannot be executed in parallel with other instructions.

The full move field is always the following 14-bit control field:
bit 13: s | bits 12-10: rrr | bits 9-6: pppp | bits 5-0: RRRRRR

In short moves the move field is as follows:
srrr p000 000RRR

s = store/load, r = address register, p = post-modification mode, R = move source/destination register

In the double full move, the 14-bit fields come directly after the instruction:
bits 27-14: srrrppppRRRRRR (X full move) | bits 13-0: srrrppppRRRRRR (Y full move)

A parallel move can be either one full move, two short moves or one register-to-register move. The coding of parallel moves is:
Full move: bits 13-0 = srrrppppRRRRRR; b = bus (0 = X, 1 = Y)
Short moves: bit 16 = 1, bits 15-8 = srrrpRRR (X short move), bits 7-0 = srrrpRRR (Y short move)
Register-to-register move (Y bus): bits 16-14 = 001, bits 13-12 = 00, bits 11-0 = ssssssdddddd
Long X move: bits 16-14 = 001, bits 13-10 = 0100, bits 9-0 = srrrRRRRRR

Table 7.8: Registers in short move [table body not recoverable]

The coding of the store/load bit is given in Table 7.10. The rrr register is the number of the desired address register. The source/destination register number (RRR/RRRRRR) is given in Table 7.9 (Source and target), and the addressing mode in Table 7.11. See also section 7.6 for furthe
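The 14-bit full move field layout (s, rrr, pppp, RRRRRR) can be unpacked with simple shifts and masks. The sketch below is illustrative only: the `FullMove` type and function names are hypothetical, and the meaning of each field value (for example which value of `s` means store) comes from Tables 7.9 to 7.11 and is not modelled here.

```python
from typing import NamedTuple


class FullMove(NamedTuple):
    s: int          # store/load bit (bit 13)
    addr_reg: int   # rrr, address register number (bits 12-10)
    post_mod: int   # pppp, post-modification mode (bits 9-6)
    reg: int        # RRRRRR, source/destination register (bits 5-0)


def decode_full_move(field: int) -> FullMove:
    """Split a 14-bit full move field into its four sub-fields."""
    if not 0 <= field < (1 << 14):
        raise ValueError("full move field must fit in 14 bits")
    return FullMove(
        s=(field >> 13) & 0x1,
        addr_reg=(field >> 10) & 0x7,
        post_mod=(field >> 6) & 0xF,
        reg=field & 0x3F,
    )


def decode_double_full_move(word: int):
    """Decode the X (bits 27-14) and Y (bits 13-0) full moves
    of a double full move instruction word."""
    x_field = (word >> 14) & 0x3FFF
    y_field = word & 0x3FFF
    return decode_full_move(x_field), decode_full_move(y_field)
```

For instance, the field `0b1_010_0011_000101` decodes to `s = 1`, address register 2, post-modification mode 3 and register number 5.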
45. g registers. If the rounding mode bit is clear, the ALU operates normally in the 2n+g base.

6.3 List of Instructions

The following table lists all basic and optional instructions. The operand set of each instruction, the mode bits affecting the operation and the flags affected are also declared.

[Instruction table not recoverable; the operand, mode bit and flag columns are lost. Surviving mnemonics and meanings:]
CALLcc: conditional call
HALT: wait for an interrupt
J: jump
Jcc: conditional jump
JMPI: jump, ignore delay slot
JRcc: conditional jump to link register
LDC: load constant
LDX: load on X bus
LDY: load on Y bus
LOOP: start loop
LSL: 1-bit left shift
LSLC: LSL with carry
LSR: 1-bit right shift
LSRC: LSR with carry
MAC: multiply-accumulate
MSU: multiply-subtract
MUL: multiply
MVX, MVY: register move
NOP: no operation
NOT: logical NOT
46. hift right with carry
LSRC Op2, An
for each i > 0: Op2[i] -> An[i-1]; C -> An[msb]
Flags: Z, N, V, E, C = Op2[0]

The instruction shifts right by one position. The LSB bit is fed to carry, and carry is fed into the MSB bit. The operand Op2 is encoded as described in Table 7.4 (ALU operand) and the result coding in Table 7.3.
Coding: bits 31-28 = 1111, bits 27-24 = 0011, bits 16-0 = parallel move; rrrr = Op2, AAA = target register

MAC: Multiply-accumulate
MAC Op1, Op2, An
An + P -> An; Op1 x Op2 -> P
Flags: Z, N, V, E, C

The instruction performs one multiplication and adds the result of the previous multiplication (P) to a register. The multiplication operands are considered signed or unsigned (see MUL); multiplication mode and possible saturation are controlled by the appropriate mode bits.
Coding: bits 31-28 = 0101, bits 16-0 = parallel move; rrr = Op1, RRR = Op2, AAA = target register, mm = data format

MSU: Multiply-subtract
MSU Op1, Op2, An
An - P -> An; Op1 x Op2 -> P
Flags: Z, N, V, E, C

The instruction performs one multiplication and subtracts the result of the previous multiplication (P) from a register. The multiplication operands are considered signed or unsigned (see MUL).
Coding: bits 31-28 = 0111, bits 16-0 = parallel move; rrr = Op1, RRR = Op2, AAA = target register, mm = data format

MUL: Multiply
MU
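Because MAC adds the previous product P while computing a new one, a sum of products needs one initial MUL and one final accumulate of P. The Python sketch below models that pipelining for a plain integer dot product; the function name is hypothetical, and saturation, rounding and fractional alignment are deliberately left out.

```python
def pipelined_dot(samples, coefs):
    """Sum of products computed the way a MUL / MAC ... / ADD P
    sequence would: each MAC accumulates the *previous* product."""
    if not samples or len(samples) != len(coefs):
        raise ValueError("need two non-empty sequences of equal length")
    acc = 0
    p = samples[0] * coefs[0]            # MUL: first product into P
    for x, c in zip(samples[1:], coefs[1:]):
        acc, p = acc + p, x * c          # MAC: An + P -> An, Op1 x Op2 -> P
    acc += p                             # final ADD of P flushes the pipeline
    return acc
```

Leaving out the final accumulate would drop the last product, which is the usual pitfall with a pipelined multiplier of this kind.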
47. hmetic register A1 is written to, either from the data buses or from the ALU, A2 is written with the sign extension of A1. The only exception is when the ALU operates in (2n+g)-bit mode and the ALU result is written to A2:A1:A0. In this case the uppermost g bits of the ALU result are written to A2. Note that if the ALU operates in n-bit mode and A1 is the result register, the sign extension will be written to A2. If A0 is the result register, A2 is not written to.

Chapter 3 Data Address Generator

3.1 Architecture Overview

The Data Address Generator performs data address calculations and drives the data address buses. It contains the index registers I0-I7.

[Figure 3.1: Data Address Generator overview]

The Data Address Generator contains two identical parallel address ALU units and is capable of providing two independent data addresses on each cycle. Two address registers used in addressing can be post-modified.

3.1.1 Index Register File

The index register file contains the index registers I0-I7. Two index registers can be accessed from the X and Y data buses each cycle. Index registers are used to form X or Y memory addresses. The registers are accessed in pairs. Each register designated In
48. ion (E)

If the ALU is operating in the (2n+g)-bit mode and bits 2n+g-1 ... 2n-1 are all the same (either all ones or all zeros), the flag is cleared. If the ALU is operating in the n-bit mode, the flag is cleared. Otherwise the flag is set.

6.2.8 Carry (C)

If a carry is generated in an addition or a borrow is generated in a subtraction, the flag is set. The flag is also set in LSR and LSRC if the LSB bit of the operand is logical 1. Otherwise the flag is cleared.

6.2.9 Saturation (S)

If the saturation mode bit is set, the ALU and multiplier operations will saturate the result in case of an overflow or underflow. The overflow flag will be set, but its interpretation is that saturation has taken place in the ALU. Rounding precedes saturation if both are enabled simultaneously. If the mode bit is clear, the ALU and multiplier will not saturate their outputs, and the overflow flag will have its normal meaning.

6.2.10 Integer (I)

If the integer mode bit is set, the multiplier result is interpreted as an integer, and thus no re-alignment is needed. Otherwise the multiplier result is assumed to be a fractional number with two leading sign bits, which will be re-aligned by a single left shift before storing in the P register. Normally a zero will be fed into the LSB. In saturation to the largest positive value, the LSB will be set to one.
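The integer/fractional alignment of the multiplier result can be illustrated as follows. This is a sketch assuming signed n-bit operands and a 2n-bit P register; the function name is hypothetical, and the wrap-around result with saturation disabled is an assumption rather than something the manual states.

```python
def multiply_to_p(a: int, b: int, n: int = 16,
                  integer_mode: bool = False, saturate: bool = True) -> int:
    """Model of the value stored in P for signed n-bit operands.

    Integer mode: the raw product is stored.
    Fractional mode: the product is shifted left once to remove the
    duplicate sign bit. Only (-1.0) x (-1.0) can overflow; with
    saturation it becomes the largest positive value (LSB = 1).
    """
    product = a * b
    if integer_mode:
        return product
    aligned = product << 1               # zero fed into the LSB
    largest = (1 << (2 * n - 1)) - 1
    if aligned > largest:
        if saturate:
            return largest
        return aligned - (1 << (2 * n))  # assumed two's complement wrap
    return aligned
```

With 16-bit fractional operands, 0.5 x 0.5 (`0x4000` times `0x4000`) yields `0x20000000`, which is 0.25 in the 32-bit fractional format.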
49. isting instructions can be fine-tuned by generating new operation modes, which can be chosen by setting or clearing some of the reserved mode bits. In a similar manner, some new flags and conditions can be added.
- New Pipeline Register Modifications. The multiply-accumulate operation is pipelined, and it is possible to invent new ways of shifting, masking or modifying the pipeline register content when using it as an ALU accumulation operand.
- New Addressing Modes. There remains a reserved code in the addressing mode field. More addressing modes can thus be added to the architecture.
- New Arithmetic-Logic Instructions. There are reserved operation codes for including new ALU operations using the same register set as the basic core. This is the way to add e.g. barrel shifter support or special bit operations to the instruction set.
- New Conditional Instructions. E.g. conditional parallel moves can be coded by employing reserved bits in some instructions. This enables e.g. conditional storage of data or a pointer from registers to memory. ALU operations can also be made conditional if necessary.
- Custom Logic Blocks in Parallel with the Core. There are reserved opcodes for the inclusion of almost full-length extension instructions. A custom logic block (co-processor) can be placed in parallel with the core, and the operation is still
50. ll interrupts. Writing to the chip-specific memory address INT_GLOB_EN enables the interrupts.

The interrupts must be disabled during the RETI instruction execution, and they will therefore be enabled in its delay slot. The RETI will also clear the L flag, and the restoring of MR0 must therefore come before it if the flag is not cleared by the user.

The following is a typical 32-bit interrupt routine. Both of the interrupt routines presented here may change the values of the X and Y flags, which therefore cannot be used in the actual program in sections where interrupts are enabled.

[Interrupt routine listing; operand punctuation and post-modification signs were lost in extraction, so the tokens are kept as extracted:]
STX 17 16 1 STY lr0 16 MVX ipr1 i7 STX Id i6 1 STY i7 16 STX mr1 16 1 STX i0 16 STY il 16 actual interrupt functionality LDX RE Op LDY i6 1 il LDX 16 1 mr0 DE NT_GLOB_EN i7 STY i7 17 LDX 16 lr0 LDY i6 1 i7 JR i7 LDX 16 i7 LDY 16 1 1r0 i7 EEN ipri 1r0

I7 and LR0 must be restored in the delay slot of the JR instruction, because the JR uses them both.

5.4 System Reset

System reset forces the processor to a known reset state. After reset is released, the processor starts executing instructions from the reset address onwards. All registers except LE and PC are zeroed on reset. LE is set to all ones. PC is set to the reset vector. The Interrupt Controller is forced to in
51. lso the LOOP instruction does not have the restriction, so single-instruction loops are allowed.
illegal_example: ldc loop_end1 le / ldx i0 lc /* le comparison not done */ / nop / loop_end1: nop
legal_example: LAG 2 18 / ldc loop_start ls / ldc loop_end2 le /* le comparison is done */ / nop / loop_end2: nop
6.5.2 Conditional Jump Restrictions
The instruction immediately before the jump instruction (JRcc or Jcc) must not change the flags that affect the jump condition. For example, if the jump is a JCC (jump if carry clear), the instruction immediately before must not change the C flag. In practice this means that the instruction must not be an ALU instruction. The X and Y flags can be changed, since they do not affect the carry-clear condition.
example1: ldx i0 1 NULL /* must not change C flag */ / jcc jump_target /* jump */ / nop /* delay slot */
example2: not a0 a1 /* must not change X flag */ / jxs jump_target /* jump */ / nop /* delay slot */
The reason for this restriction is that the jump condition is determined during the decode phase. In normal linear execution, the instruction immediately before the jump does not affect the jump. The situation is different if the jump instruction is canceled due to an interrupt. When execution returns from the interrupt to
52. mp instruction, but ignores the instruction in the delay slot (a NOP is executed instead) and jumps to zero page. Also, the index register specified is optionally modified, identical to LDX Op1 n NULL. The X flag is not updated. Flags: no change, regardless of the LDX result. This instruction is used in the interrupt vector jump table.
Coding: 28 27 24 23 20 19 16 15; nn...nn = absolute address, rrr = address register, dd = don't care, mm = address mode (00 = no update, 01 = +1, 11 = -1).
JRcc: Conditional delayed jump to the address in link register 0.
JRcc: if (cond) LR0 -> PC. Flags: L = 0.
JRcc Op1: if (cond) LR0 -> PC, Op1 <- IPR0. Flags: L = 0.
The JRcc instruction can be used for returns from subroutines as well as for other jumps with run-time calculated addresses. The return addresses are typically loaded by an LDC instruction. Flags and their combinations can be used as jump conditions, as shown in Table 6.1 (Jump conditions). The instruction immediately before the JRcc must not change the flags that are used in the jump condition. Other flags can be changed. An unconditional return can be done with the "always" condition. Note the one delay slot associated with this instruction.
Coding: 28 27 24 23 20 19 16 15 12 11; 0010 0000 0ddd dddd ddd d mas 28 27 24 23
53. n additional mode.
3.1.3 Flags
MR0 is the processor mode/status register. The bits affected by the address calculation are the X flag (index X flag) and the Y flag (index Y flag). Index flags are evaluated whenever an index ALU is enabled. If the ALU uses modulo addressing, the flag is set if the modulo wrap-around is done. If the ALU uses an addressing mode other than modulo, the flag is the sign bit of the index register used (In).
3.2 Post-modification Modes
Addressing has two post-modification modes specified in the instruction: post-modification by -7..7, or post-modification by In. The coding is summarized below:
indirect In with post-modify by pppp (-7..7)
rrr1000: indirect In with post-modification specified in In
If Tn is used to specify the post-modification mode, the 3 MSBs of Tn are used to specify the post-modification mode as follows: i In In m optional In In m optional In In m bit reverse optional In In m m negative
3.2.1 Linear Post-increment/decrement
Linear post-inc/dec can be an immediate (pppp, -7..7) modification or a modification by In. The post-modifier mod is either the immediate pppp, sign-extended:
mod[i] = pppp[3] for each i = AW-1 .. 3
mod[i] = pppp[i] for each i = 2 .. 0
or the modifier specified by In, sign-extended: for each i AW 1 AW 3
54. [Table residue: addressing mode codings for linear post-inc/dec, modulo post-inc/dec and bit reversal; cell contents not recoverable.]
Chapter 8 Software Examples
8.1 Single Precision FIR Transversal Filter
This code implements a single-precision, single-sample, direct-form transversal 16-stage FIR filter. The input and the coefficients are 16 bits wide, the intermediate results being 32 bits.
fract 15
sect data_x XData
delay zero 15 /* x[-15] .. x[-1], delay line at startup */
input uword 0x1234 /* x[0] at startup */
output zero 1
sect data_y YData
coef zero 16
sect code Single_precision_FIR
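As a cross-check for the single-precision FIR described in section 8.1, the following is a minimal reference model in Python, not the manual's listing. It assumes n = 16, fractional mode (each 16 x 16 product is shifted left by one bit), a 32-bit accumulator that wraps rather than saturates, and that the output is the high word of the accumulator. The names `s16` and `fir_sample` are hypothetical helpers introduced only for this sketch.

```python
def s16(x):
    """Interpret the low 16 bits of x as a two's complement number."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def fir_sample(delay, coefs):
    """One output sample of a Q15 FIR with a 32-bit (Q31) accumulator.

    delay and coefs are equal-length sequences of raw 16-bit words.
    Assumption: the accumulator wraps at 32 bits (no saturation modelled).
    """
    acc = 0
    for x, c in zip(delay, coefs):
        acc += (s16(x) * s16(c)) << 1  # Q15 x Q15 -> Q31 in fractional mode
    acc &= 0xFFFFFFFF                  # keep 32 bits
    return (acc >> 16) & 0xFFFF        # high word as the 16-bit result


# 0.5 * 0.5 = 0.25 in Q15
print(hex(fir_sample([0x4000] + [0] * 15, [0x4000] + [0] * 15)))
```

A model like this is mainly useful for generating expected outputs when checking an assembly implementation in a simulator.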
55. oef_hi uword 0x8001 0 zero 14
sect data_y Y
input_lo zero 14
output_lo zero 16
coef_lo uword 0xffff 0x 0xaea zero 14
sect code Double_precision_FIR /* Double precision single sample FIR */
LDC 0x200 mr0
LDC input i0
LDC 0xa00f i1 /* modulo 1 addressing */
LDC ll
LDC coef i2
LDC IES
AND a NULL a /* intermediate results in a, b */
AND b NULL b /* set result to zero */
LDC 15 ls /* 16 stages */
LOOP ls firloop
LDC output i4
/* Next sample from delay line -> c, next coefficients -> d */
LDX i0 c1 LDY i0 c0
LDX i2 d1 LDY i2 d0
/* 32x32 bit MAC with 64 bit result */
MULUU c0 d0
ADD b p b
MULSS c1 d1
ADDC a p a
MULUS c0 al
ADD NULL p c
ADD c0 b1 b1 LDX i0 c0 LDC al
MULSS Aled /* sign extend BC 31 16 */
ADDC a p a
MULSU c0 d0
ADD NULL p c
ADD GU Tal
MULSS dl cl /* sign extend AD 31 16 */
firloop: ADDC a p a /* result after this stage in a, b */
/* scale result to Q31 and store */
LSL b b
LSLC a a
STX a1 i4 STY a0 i4 1 /* store output */
endfir: NOP
end
8.3 Cascaded Biquad IIR Filter
This code implements a single-sample IIR filter as a cascade of second-order biquad sections. The number of sections in this
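The 32 x 32-bit multiply in the double-precision FIR listing is built from 16-bit partial products (unsigned x unsigned for the low halves, signed x signed for the high halves, and mixed signed/unsigned for the cross terms). The sketch below models only that decomposition in Python, as plain integer arithmetic. It does not model the fractional scaling, the accumulation across stages, or the register allocation of the listing, and the helper names `s16`, `s32` and `mul32x32` are hypothetical.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def s16(x):
    """Interpret the low 16 bits of x as a two's complement number."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def s32(x):
    """Interpret the low 32 bits of x as a two's complement number."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def mul32x32(a, b):
    """Signed 32x32 -> 64-bit product from four 16x16 partial products."""
    ah, al = s16(a >> 16), a & 0xFFFF  # high half signed, low half unsigned
    bh, bl = s16(b >> 16), b & 0xFFFF
    acc = al * bl                      # unsigned x unsigned
    acc += (ah * bl + al * bh) << 16   # signed x unsigned cross terms
    acc += (ah * bh) << 32             # signed x signed
    return acc & MASK64


a, b = 0x92345678, 0x80010000
print(mul32x32(a, b) == (s32(a) * s32(b)) & MASK64)
```

Treating the high halves as signed and the low halves as unsigned is what makes the sum of the four partial products equal the full signed product.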
56. on file. The software tools can adjust their operation according to the parameter values, and the actual hardware is generated by using the very same parameters as input for either layout module generators or synthesis scripts.
1.1.3 Extensions
The basic set of instructions can be extended by adding custom hardware. The hardware can be incorporated as a part of the core or as a coprocessor. The extension instruction mechanisms allow hardware/software trade-offs to be made in the application development. They also have a major impact on the power consumption by enabling a lower clock frequency to be used.
Table 1.1: VS_DSP Parameters (columns: Parameter, Parameter symbol (VHDL), Notes; only partly recoverable)
• Data word length (n): 64 bits, 16 bits, 16 bits; relates to data registers and buses.
• Number of arithmetic registers: 8.
• Modifier only: 0/1; forbids the use of odd registers for addresses.
• Loop hardware: 0/1/N; levels of loop hardware.
• Addressing mode mask: three-bit word (0..7); enable bits for modulo, bitrev and reserved modes.
• Round mode: 0/1/2/3; selects the rounding mode implemented (0 = truncate, 1 = round, 2 = convergent 0, 3 = convergent 1).
There are hooks for the following types of extensions:
• New Operation Modes: The operation of ex
57. parallel move NOP is a load operation to the NOP register. A total NOP is LDC to NOP.
Coding: 28 27 24 23 20 19 17 16 0; 1111 0100 ddddadal, parallel move; ddd = don't care.
NOT: Bitwise logic NOT operation.
NOT Op2, An: for each i, NOT Op2[i] -> An[i]. Flags: Z, N, V = 0, E, C = 0.
The operand Op2 coding is shown in Table 7.4 (ALU operand); the target can be one of the registers. In hardware this is equal to an XOR with register ONES.
Coding: 28 27 24 23 20 19 17 16 0; 1101 1001, parallel move; rrrr = Op2, AAA = target register.
OR: Bitwise logic OR operation.
OR Op1, Op2, An: for each i, Op1[i] OR Op2[i] -> An[i]. Flags: Z, N, V = 0, E, C = 0.
The operands are encoded as described in Table 7.4 (ALU operand) and the result coding in Table 7.3. The target is one of the registers.
Coding: 28 27 24 23 20 19 17 16 0; parallel move; rrrr = Op1, RRRR = Op2, AAA = target register.
RESP: Restore P register.
RESP Op1, Op2: Op1 -> P0, Op2 -> P1. Flags: no change.
This instruction restores the P contents from two arithmetic registers. The saving of P shall be done as described in section 2.5. The operands are encoded as multiplication operands.
Coding: 28 27 24 23 20 19 16 15 12 11; rrr = Op1, RRR = Op2, ddd = don't care bits.
RETI: Delayed return from interrupt.
RETI: LR1 -> PC. Flags
58. r description of the addressing modes available. The post-modification pppp is a four-bit two's complement number (-7..7), which is added to the address register. The code -8 is for the additional address post-modification modes found in In. The In is the index register whose number is generated by inverting the LSB of the number of register In. It is recommended to use the odd-numbered registers as In and even as In. The modifier register set Mn (in the basic version aliased to the odd In) can be used instead of the In. If even In and even Mn are used in the basic version, the code will be transferable to versions with an additional dedicated modifier register set. The post-modifications by the In/Mn are defined in Table 7.12.
7.6 Addressing Modes
The addressing modes and their availability in short and full formats are summarized in Table 7.13. The addressing modes available in the implementation are controlled by the parameter "Addressing mode mask", which has enable bits for the modulo, bit reversal and reserved addressing modes in the following manner: [bit layout figure; only the label "reserved" survives].
The modulus m is given by the lower end of Tn (word length - 3 bits) in unsigned format, such that the third bit from the MSB end of Tn defines whether to add or subtract. In the m case, the m is a (word length - 2)-bit two's complement number where the
Table 7.9: Registers in
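The pppp field described above (a four-bit two's complement number in the range -7..7, with code -8 reserved for the register-specified modes) can be illustrated with a short Python sketch. It is an illustration only: the function names are hypothetical, the 16-bit address width is an assumption made for the example, and only the linear (non-modulo) case is modelled.

```python
def decode_post_mod(pppp):
    """Sign-extend a 4-bit post-modification field.

    Returns an integer in -7..7, or None for code -8, which the text
    reserves for the additional modes selected through a register.
    """
    pppp &= 0xF
    value = pppp - 16 if pppp & 0x8 else pppp
    return None if value == -8 else value


def post_modify(index, pppp, addr_bits=16):
    """Linear post-modification of an index register (assumed 16 bits wide)."""
    step = decode_post_mod(pppp)
    if step is None:
        raise ValueError("code -8 selects the register-specified modes")
    return (index + step) & ((1 << addr_bits) - 1)


print(decode_post_mod(0b1001))         # -7
print(hex(post_modify(0x1234, 0b0001)))
```

Returning `None` for the reserved code keeps the immediate case and the register-specified case visibly separate in the calling code.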
59. ress LE, D2 is the loop start address LS, or the next linear address if the loop ends. In all cases D2 is copied to LR1 and will be the interrupt's return address.
If there is the possibility that instruction 3 is a change-of-flow instruction, the instruction in its delay slot (4) would not be fetched because of the I0 fetch. To prevent this, instruction 3 must be canceled. Fig. 5.9 illustrates this situation. LR1 is not updated again on interrupt cycle 2, so it will point to the address of instruction 3.
[Figure 5.8: Interrupt when instruction 3 is not canceled]
[Figure 5.9: Interrupt when instruction 3 is canceled]
In case of nested interrupts, the old value of LR1 is lost when the first interrupt instruction (I0) is fetched. In the interrupt routine, the second instruction (I1) saves LR1, so another interrupt fetch can start when I1 has been executed (cycle 8 in Fig. 5.9). The next I0 can be fetched on cycle 8 at the earliest to be sure LR1 is saved properly. In interrupts, I0, I1 and D are always executed before another interrupt can occur. If I2 is a change-of-flow instruction, also D is executed
60. s padded with n zeros and the sign is extended to the optional guard bits. For example, if register A0 is used as a (2n+g)-bit operand, the operand is xx A0 0000 (xx means sign extension bits). The result of a (2n+g)-bit operation is either A, B, C or D. The result of an n-bit operation is A0, A1, B0, ..., D0, D1. The ALU optionally produces negative, carry, overflow, zero and extension (guard bits in use) flags.
2.5 Multiplier
The multiplier is an n x n signed/unsigned, integer/fractional, saturating/unsaturating multiplier. Both inputs can be interpreted either as signed or unsigned numbers to facilitate multi-precision operations. The integer/fractional mode bit controls the 1-bit left shift of the result (fractional mode) when it is written to P. In fractional signed x signed multiplication, saturation is optionally (in saturation mode) included, so that the result of 0x8000 x 0x8000 is 0x7fffffff, NOT 0x7ffffffe. The P register length is 2n bits. The P register can be saved by executing ADD NULL, P, An. The high and low parts will reside in the high and low parts of the target accumulator, respectively. The restoring will take place by executing the RESP instruction.
2.6 Guard bit registers
Optional guard bit registers behave as an extension of registers A1, B1, C1 and D1. The following describes A2; B2, C2 and D2 function similarly, but they refer to B1, C1 and D1, respectively, instead of A1. Whenever an arit
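The fractional-mode behaviour described in section 2.5 (a 1-bit left shift of the product, and saturation of 0x8000 x 0x8000 to 0x7fffffff) can be modelled in a few lines of Python. This is a sketch under stated assumptions: n = 16, signed x signed operands only, and a non-saturating fractional result that simply wraps to 32 bits, which the text does not specify. The names `to_signed16` and `mul_signed` are hypothetical.

```python
def to_signed16(x):
    """Interpret the low 16 bits of x as a two's complement number."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def mul_signed(a, b, fractional=True, saturate=True):
    """Model a 16x16 signed multiply that writes a 32-bit P register."""
    p = to_signed16(a) * to_signed16(b)
    if fractional:
        p <<= 1  # 1-bit left shift applied in fractional mode
        # Only 0x8000 * 0x8000 can exceed the positive 32-bit range.
        if saturate and p > 0x7FFFFFFF:
            p = 0x7FFFFFFF
    return p & 0xFFFFFFFF  # assumption: wrap to 32 bits when not saturating


print(hex(mul_signed(0x8000, 0x8000)))  # 0x7fffffff
print(hex(mul_signed(0x4000, 0x4000)))  # 0x20000000 (0.5 * 0.5 = 0.25 in Q31)
```

The saturation check sits after the shift because the overflow only appears once the product has been doubled.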
61. s the datapath in detail.
Chapter 3 explains the addressing modes and data address calculation unit functionality.
Chapter 4 describes the program control unit.
Chapter 5 illustrates the control flow of the core.
Chapter 6 is the instruction set reference, with the programming model, flags and mode bits, and a detailed description of each instruction in the basic instruction set.
Chapter 7 describes the instruction coding field by field in different instruction types.
Chapter 8 gives assembly language software examples.
Chapter 2 Datapath
2.1 Overview
The VS_DSP datapath architecture is depicted in Fig. 2.1. The datapath operates with the principle of one cycle per instruction, from register to register. The (2n+g)-bit ALU implements the arithmetic (ABS, ADD, ADDC, SUB, SUBC, MAC, MSU) and logic (AND, ASR, LSL, LSLC, LSR, LSRC, NOT, OR, XOR) instructions. MUL is implemented by the separate multiplier. The ALU has up to eight n-bit arithmetic registers (A0, A1, B0, ..., D0, D1). Optional guard bit registers (A2 ... D2) are available. These can be combined to form (2n+g)-bit accumulators A, B, C and D. Note that C and D are optional.
The multiplier is an n x n-bit signed/unsigned, integer/fractional, saturating/unsaturating multiplier. Multiplier inputs are A0, A1, B0, ..., D0, D1. Multiplier output goes to a 2n-bit pipeline register P, which can be us
62. sses before LDC 1 CB CA 2 13 /* the next round, next column */ LDX 12 NULL STY al 14 1 /* store C[i][j] */ ADD d1 ONES d1 LDX 10 NULL LDC 1 il /* restore modifiers */ JZC nextcolumn LDC CBs LDC CA il /* modify addresses before the next round, next row */ LDC SCBz13 ADD d0 ONES d0 LDX i0 NULL LDX 12 NULL LDC trad /* restore modifiers */ JZC nextrow LDC CB i3 endmult: end
8.5 Floating Point Multiplication and Addition
Single precision, i.e. a0 = exponent (16 bits, signed) and a1 = mantissa (1.15 format, Q15, from -1.0 to 0.9999...). f_mul multiplies a and b and puts the result in c. f_add is the addition routine (c = a + b) and f_sub is the subtraction (c = a - b).
fract 15
/* Maximum difference in exponents. If the difference is greater, no calculation is done and the larger number is returned. */
define _F_MAX_EXP_DIFF 16
/* Stack pointer index register */
define SP i6
sect code Floating_point
/* Fractional mode must be set, saturation mode must be unset, e.g. LDC 0x0000 mr0 */
/* a * b -> c */
f_mul: MULSS al bl ADD NULL p c /* truncate mode */ J f_norm_res ADD a0 b0 d0
/* a + b -> c */
f_add: SUB a0 b0 d0 LDX 16 1 NULL /* make room to stack */ LDC F_ MAX _EXP_DIFF dl JGE 51 /* exp a >= exp b */ ADD a NULL c /* swap a, b */ ADD b NULL a ADD C NULL b SUB a0 b0
63. t coding in Table 7.3.
Coding: 28 27 24 23 20 19 17 16 0; 1001 RRRR rrrr, parallel move; RRRR = Op1, rrrr = Op2, AAA = target register.
XOR: Bitwise logic XOR operation.
XOR Op1, Op2, An: for each i, Op1[i] XOR Op2[i] -> An[i]. Flags: Z, N, V = 0, E, C = 0.
The operand coding of Op1 and Op2 is shown in Table 7.4 (ALU operand) and the result coding in Table 7.3. XOR has also been used to implement NOT.
Coding: 28 27 24 23 20 19 17 16 0; RRRR rrrr, parallel move; RRRR = Op1, rrrr = Op2, AAA = target register.
6.5 Instruction Sequence Restrictions
There are certain sequences of instructions which, due to the pipelined execution, would produce undetermined results. These sequences are either flagged as errors by the software tools or masked off by the hardware.
6.5.1 Loop Register Restrictions
When either the LE, LC or LS register is loaded from memory with an LDX or LDY instruction, the loop end comparison is not done. This means that loop registers cannot be loaded by an instruction whose address is LE-2. If this is done, further loop rounds are ignored and the execution continues linearly. The LDC instruction does not have this restriction, and the loop hardware uses the value loaded with an LDC if it is needed on the same cycle. A
64. terrupt cycle 3. Fig. 5.10 shows reset behavior. R0 .. R3 denote the addresses reset vector .. reset vector + 3.
[Figure 5.11: HALT execution]
5.5 Halt
In HALT, the processor input clock is held low until an interrupt occurs. The execution pipeline is stopped.
When an interrupt occurs, the processor will execute 3 instructions after the HALT instruction before executing the first interrupt instruction. See Figure 5.11. In the figure, the execution of the HALT instruction takes 3 cycles (cycles 2-4). The interrupt request is received during cycle 4.
If the interrupt state machine is not in the idle state when HALT goes to execution, the HALT instruction has no effect and is executed like a NOP.
Chapter 6 Instruction Set Reference
6.1 Programming Model
The processor programming model is shown in Fig. 6.1. The processor contains arithmetic, address and control registers.
[Figure 6.1: Processor programming model]
Arithmetic registers are the n-bit registers A0, A1, ..., D1 and the g-bit guard bit registers A2 ... D2. The multiplier pipeline register (P0, P1) is also shown. There is no guard bit r
65. uilt by the user.
• Configuration Files: The Configuration Files describe the system. There is a configuration file to declare the parameter values of the core and another file for allocating memory and mapping peripherals to the memory space.
• Instruction Set Simulator: The Instruction Set Simulator (ISS) reads lod or coff format object files generated by the Linker and performs an interactive instruction-level simulation. The ISS uses the Configuration Files to create a correct model of the core and its surroundings. The features include disassembly, breakpoints, memory and register watch, profiling, dumping and undumping of the state (save and resume), file I/O, and generation of test vectors to be used for hardware verification.
• Emulator User Interface: The Emulator User Interface looks like the ISS, but it connects to the Development Board for program execution instead of using the simulator engine.
• C Compiler: The C Compiler reads ANSI C based source code interleaved with some optimization constructs and produces VS_DSP code ready to be assembled.
All software included in the VSKIT is documented in a separate manual called VS_DSP Software Tools User's Manual. For further information, please refer to that manual.
1.3 Organization of This Manual
The rest of this manual is organized as follows: Chapter 2 describe
66. Contents (recoverable entries):
1.2 VS_DSP Development System
1.3 Organization of This Manual
2 Datapath
2.1 Overview
2.3 Flags and Mode Bits
2.4 ALU
2.5 Multiplier
2.6 Guard bit registers
3 Data Address Generator
3.1 Architecture Overview
3.1.1 Index Register File
3.1.2 Address ALU
3.1.3 Flags
3.2 Post-modification Modes
3.2.1 Linear Post-increment/decrement
3.2.2 Modulo Post-increment/decrement (Optional)
3.2.3 Bit Reversal (Optional)
4 Program control
4.1 Architecture Overview
4.1.1 Instruction Decode
4.1.2 Instruction Address Generator
4.1.3 Interrupt Control
4.2 Programming Model
67. uses. There are two data buses, X and Y, with the corresponding address buses XAB and YAB (not shown in the figure). An instruction bus I and a corresponding address bus IAB are used for code memory accesses. All the buses are also available for off-core use.
The following units may be attached to the core when building system ASICs or ASSPs, but they are not considered to be a part of the core:
• Memory: RAM and ROM, or any memory available in the particular fabrication process, can be placed on chip. The amount of on-chip memory can be tailored to the application, the only practical limits being economical and technical limitations of the fabrication process (the resulting die size). Off-chip memory does not have any implications from the technology used. On the other hand, off-chip accesses typically go through a bus switch.
• External Bus Switch: Off-chip accesses can be multiplexed to a single address and data bus to save in the pin count of the package. The use of flexible wait states enables the use of external memories with different timing characteristics.
• Peripherals: Serial and parallel interface ports, timers, and also analog interfaces (analog-to-digital and/or digital-to-analog converters) may be attached, subject to technology limitations (analog precision available etc.). The peripherals are mapped to the data memory space of the core and
68. ute address.
J addr: addr -> PC. Flags: L = 0.
Note the one delay slot associated with this instruction.
Coding: 28 27 24 23 20 19 16 15 12 11; nn...nn = absolute address.
Jcc: Conditional delayed jump to absolute address.
Jcc addr: if (cond) addr -> PC, else PC + 1 -> PC. Flags: L = 0.
Flags and their combinations can be used as jump conditions, as shown in Table 6.1 (Jump conditions). The instruction immediately before the Jcc must not change the flags that are used in the jump condition. Other flags can be changed. Note the one delay slot associated with this instruction.
Coding: 28 27 24 23 20 19 16 15 12 11; nn...nn = absolute address, cccccc = condition.
This instruction is implemented as a single-instruction software macro.
[Table 6.1: Jump conditions. Recoverable entries include YS (index Y set), less than or equal to zero, extension clear, and greater than or equal to zero; the condition codes are not recoverable.]
JMPI: Jump, ignore delay slot, increment index register.
JMPI addr Op1 n: addr -> PC, Opl n Opl 0 IPRO.
Identical to normal ju
69. ycle 1 is a change-of-flow instruction, or
• the instruction fetch on interrupt cycle 1 occurs from the loop end address LE.
[Figure 5.7: Interrupt cycle control flow. Recoverable labels: interrupt pending; LR1 = fetch address; fetch normally if the instruction in Decode is not cancelled; Int Cycle 2: fetch from I0, set L flag, acknowledge interrupt; Int Cycle 3; Int Cycle 4: MR0 -> MR1; Int Cycle 5.]
If the instruction fetched on interrupt cycle 1 is not canceled, on interrupt cycle 2 LR1 is loaded with the destination address of the instruction that is in the execute stage on interrupt cycle 2. The destination address is the address that would be fetched if the interrupt fetch had not occurred, e.g. the jump target address or the next linear address. During interrupt cycle 4, MR0 is copied into MR1. From the interrupt's point of view, change-of-flow instructions are J, Jcc, JRcc, RETI and LOOP.
Fig. 5.8 shows an example when canceling does not occur and Fig. 5.9 when it does occur. In Fig. 5.8, instruction 2 is a change-of-flow instruction, or instruction 3 is fetched from the loop end address LE. Instruction 3 must be executed before the interrupt is serviced. If instruction 2 is a change-of-flow instruction, the destination address (denoted by D2) is the jump target address, or the next linear address if the jump is not taken. If instruction 3 is fetched from loop end add
