
Tensorlab User Guide



[6] L. C. W. Dixon and G. P. Szegő. The global optimization problem: an introduction. In Towards Global Optimisation II, pages 1-15. North-Holland Pub. Co., 1978.
[7] R. A. Harshman. Foundations of the PARAFAC procedure: Models and conditions for an "explanatory" multi-modal factor analysis. UCLA Working Papers in Phonetics, 16:1-84, 1970.
[8] F. L. Hitchcock. The expression of a tensor or a polyadic as a sum of products. Journal of Mathematics and Physics, 6(1):164-189, 1927.
[9] F. L. Hitchcock. Multiple invariants and generalized rank of a p-way matrix or tensor. Journal of Mathematics and Physics, 7(1):39-79, 1927.
[10] T. G. Kolda and B. W. Bader. Tensor decompositions and applications. SIAM Review, 51(3):455-500, Sept. 2009.
[11] L. Sorber, I. Domanov, M. Van Barel, and L. De Lathauwer. Exact line and plane search for tensor optimization. ESAT-STADIUS Internal Report 13-02, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium, 2013.
[12] L. Sorber, M. Van Barel, and L. De Lathauwer. Unconstrained optimization of real functions in complex variables. SIAM Journal on Optimization, 22(3):879-898, 2012.
[13] L. Sorber, M. Van Barel, and L. De Lathauwer. Optimization-based algorithms for tensor decompositions: canonical polyadic decomposition, decomposition in rank-(Lr,Lr,1) terms, and a new generalization. SIAM Journal on Optimization, 23(2):695-720, 2013.
Choosing a as ones(size(x)) in the example functions above, the following example uses deriv to compute the real gradient of these functions with the i-trick:

% Three test functions.
f1 = @(x)1e-20./(1-1e3*x);
f2 = @(x)sum(sin(x+ones(size(x))).^3);
f3 = @(XY)atan(trace(XY{1}'*XY{2}));

% Their first-order derivatives.
g1 = @(x)1e-17./(1-1e3*x).^2;
g2 = @(x)3*sin(x+ones(size(x))).^2.*cos(x+ones(size(x)));
g3 = @(XY){XY{2}/(1+trace(XY{1}'*XY{2})^2), ...
           XY{1}/(1+trace(XY{1}'*XY{2})^2)};

% Approximate the real gradient with the i-trick and compute its relative error.
x = randn;
relerr1 = abs(g1(x)-deriv(f1,x))/abs(g1(x));
x = randn(10,1);
relerr2 = norm(g2(x)-deriv(f2,x))/norm(g2(x));
XY = {randn(10),randn(10)};
relerr3 = cellfun(@(a,b)frob(a-b)/frob(a),g3(XY),deriv(f3,XY));

In Tensorlab, derivatives of scalar functions are returned in the same format as the function's argument. Notice that f3 is a function of a cell array XY, containing the matrix X in XY{1} and the matrix Y in XY{2}. In a similar vein, the output of deriv(f3,XY) is a cell array containing the matrices ∂f/∂X and ∂f/∂Y. Often, this allows the user to conveniently write functions as a function of a cell array of variables containing vectors, matrices or tensors, instead of coercing all variables into one long vector which must then be disassembled into the respective variables.
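The same derivatives can also be approximated with plain finite differences (the 'gradient' option discussed further below); a small, hedged comparison on f2, using the functions defined above:

% Compare the i-trick with finite differences on f2.
x = randn(10,1);
err_itrick = norm(g2(x)-deriv(f2,x))/norm(g2(x));
err_findiff = norm(g2(x)-deriv(f2,x,[],'gradient'))/norm(g2(x));
% err_itrick is typically near machine precision, while err_findiff is
% typically several orders of magnitude larger.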
= {'U','V','LX'};
model.factorizations.yfac.data = Y;
model.factorizations.yfac.cpd = {'U','W','LY'};

Here, we use the CPD to describe the factorizations $X = U \Lambda_X V^{\mathrm T}$ and $Y = U \Lambda_Y W^{\mathrm T}$ by associating the factor matrices $\Lambda_X$ and $\Lambda_Y$ with the third dimension of X and Y. Alternatively, we could describe the two factorizations with the BTD model by imposing the core tensor to be a diagonal matrix with struct_diag, but this is less efficient than using the CPD model. The SDF problem can now be solved with:

% Solve the SDF problem.
options.Display = 5;   % View convergence progress every 5 iterations.
options.TolFun = 1e-9; % Stop earlier.
sol = sdf_nls(model,options);
sol.variables
sol.factors

Although the algorithm converges to a solution, it usually does not converge to the solution we used to generate the problem. This indicates that there are not enough constraints to make the solution to this SDF problem unique, in contrast to the first example.

Example 3: an orthogonal factor. We show how to compute a simple CPD in which one of the factors is constrained to have orthonormal columns. To this end, we will create a variable q that parameterizes a matrix with orthonormal columns Q. The structure struct_orth can then be used to transform the variable q into the factor Q (cf. Figure 6.1). First we set up the problem:

% Generate CPD wherein one factor has orthonormal columns.
I = 15; R = 4;
= [struct_toeplitz(struct_nonneg(atop)); abtm]:

model.factors.A = {{'atop',@struct_nonneg,@struct_toeplitz};{'abtm'}};
model.factors.B = 'b';
model.factors.C = U{3}; % Third factor matrix is known.

% Define model factorizations.
model.factorizations.myfac.data = T;
model.factorizations.myfac.cpd = {'A','B','C'};

% Solve the SDF problem.
options.Display = 5; % View convergence progress every 5 iterations.
sol = sdf_nls(model,options);

Example 7: regularization. Next to the CPD and BTD models, SDF also includes two models which represent L1 and L2 regularization terms. By including the factorizations

% Use L2 regularization on the factors A and B.
model.factorizations.myreg2.data = {zeros(size(U{1})),zeros(size(U{2}))};
model.factorizations.myreg2.regL2 = {'A','B'};
% Use L1 regularization on C-ones(size(C)).
model.factorizations.myreg1.data = ones(size(U{3}));
model.factorizations.myreg1.regL1 = 'C';

in the previous example, the terms $\frac{1}{2}(\|\mathrm{vec}(A)\|^2 + \|\mathrm{vec}(B)\|^2)$ and $\frac{1}{2}\|\mathrm{vec}(C) - 1\|_1$ are added to the objective function. The data field defaults to all zeros if omitted.

6.2 Implementing a new factor structure

Function signature. To add your own structure to impose on factors in an SDF model, create a function with the following function signature:

function [x,state] = struct_mystruct(z,task)

end

Function evaluation. This function will be called in three different ways by the
% View convergence progress every 5 iterations.
sol = sdf_nls(model,options);
sol.variables
sol.factors

Example 4b: modelling a rank-(L,L,1) BTD with a CPD. A different and more efficient approach to compute the rank-(L,L,1) BTD would be to cast it as a CPD, as in

$$\mathcal{T} = (A_1 B_1^{\mathrm T}) \circ c_1 + (A_2 B_2^{\mathrm T}) \circ c_2 = \sum_{j=1}^{L_1} a_j^{(1)} \circ b_j^{(1)} \circ c_1 + \sum_{j=1}^{L_2} a_j^{(2)} \circ b_j^{(2)} \circ c_2.$$

Here, the CPD's third factor matrix looks like [repmat(c1,1,L1) repmat(c2,1,L2)], while the first two would be [A1 A2] and [B1 B2], respectively. These factor matrices are all built out of subfactor matrices, which we can implement in the DSL as follows:

% Define rank-(Lr,Lr,1) BTD model using the CPD.
model.variables.A1 = randn(size_tens(1),L1);
model.variables.B1 = randn(size_tens(2),L1);
model.variables.c1 = randn(size_tens(3),1);
model.variables.A2 = randn(size_tens(1),L2);
model.variables.B2 = randn(size_tens(2),L2);
model.variables.c2 = randn(size_tens(3),1);
% The factors A, B and C are built out of subfactors.
% Structure may also be imposed on subfactors if desired.
model.factors.A = {'A1','A2'};
model.factors.B = {'B1','B2'};
model.factors.C = {'c1','c1','c2','c2'};
model.factorizations.mybtd.data = T;
model.factorizations.mybtd.cpd = {'A','B','C'};

Let's deconstruct the meaning of this syntax. model.factors.A = {'A1','A2'} says that the factor A is a matrix consisting of two subfactors placed horizontally next
= 1e-6;
options.MaxIter = 100;
z = minf_lbfgs(@f,@grad,z0,options);

See the help information on the selected algorithm for more details.

Viewing the algorithm output. The second output of the optimization algorithms returns additional information pertaining to the algorithm. For example, the algorithms keep track of the objective function value in output.fval, and also output the circumstances under which the algorithm terminated in output.info. As an example, the objective function value of each iterate can be plotted with:

% Plot each iterate's objective function value.
[z,output] = minf_lbfgs(@f,@grad,z0);
semilogy(0:output.iterations,output.fval);

See the help information on the selected algorithm for more details.

8 Global minimization of bivariate functions

Analytic bivariate polynomials. Consider the problem of minimizing a bivariate polynomial

$$\min_{x,y \in \mathbb{R}} \; p(x,y) \qquad \text{(bipol)}$$

or, more generally, a rational function

$$\min_{x,y \in \mathbb{R}} \; \frac{p(x,y)}{q(x,y)}. \qquad \text{(birat)}$$

Since all local minimizers (x*, y*) are stationary points, they are roots of the system of bivariate polynomials

$$\frac{\partial (p/q)}{\partial x} = 0, \qquad \frac{\partial (p/q)}{\partial y} = 0,$$

where q(x,y) ≡ 1 in the case of minimizing a bivariate polynomial. With polysol2, Tensorlab offers a numerically robust way of computing isolated real roots of a system of bivariate polynomials f(x,y) = g(x,y) = 0 as the eigenvalues of a generalized eigenvalue problem.
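As a minimal sketch of such a call (the random coefficient matrices below are purely illustrative; see Section 8.1 for the matrix representation of bivariate polynomials):

% Isolated real roots of a random bivariate system f(x,y) = g(x,y) = 0.
f = randn(4); % bivariate polynomial of coordinate degree 3
g = randn(4);
xy = polysol2(f,g); % each row of xy is an isolated real solution (x,y)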
Flanders (FWO): (a) G.0828.14N; and by (3) the Belgian Network DYSCO (Dynamical Systems, Control, and Optimization), funded by the Interuniversity Attraction Poles Programme, initiated by the Belgian State, Science Policy Office. Lieven De Lathauwer is supported by (1) the Research Council KU Leuven: (a) GOA-MaNet, (b) CoE EF/05/006 Optimization in Engineering (OPTEC), (c) CIF1, (d) STRT1/08/023; (2) the Research Foundation Flanders (FWO): (a) G.0427.10N, (b) G.0830.14N, (c) G.0881.14N; and by (3) the Belgian Network DYSCO (IUAP P7/19).

References

[1] R. Bro. Multi-way analysis in the food industry: models, algorithms, and applications. PhD thesis, University of Amsterdam, 1998.
[2] J. D. Carroll and J.-J. Chang. Analysis of individual differences in multidimensional scaling via an n-way generalization of "Eckart-Young" decomposition. Psychometrika, 35(3):283-319, 1970.
[3] L. De Lathauwer. Decompositions of a higher-order tensor in block terms. Part I: Lemmas for partitioned matrices. SIAM Journal on Matrix Analysis and Applications, 30(3):1022-1032, 2008.
[4] L. De Lathauwer. Decompositions of a higher-order tensor in block terms. Part II: Definitions and uniqueness. SIAM Journal on Matrix Analysis and Applications, 30(3):1033-1066, 2008.
[5] L. De Lathauwer, B. De Moor, and J. Vandewalle. A multilinear singular value decomposition. SIAM Journal on Matrix Analysis and Applications, 21(4):1253-1278, 2000.
O(n³) floating point operations (flop) to compute the solution. Instead, we will focus on a nonlinear extension of this equation to low-rank solutions X, which enables us to solve large-scale Lyapunov equations. From here on, X is represented as the matrix product U*V', where U ∈ C^(n×k), V ∈ C^(n×k) and k ≪ n. In the framework of minf and nls, we define the objective function and residual function as

$$f_{\mathrm{lyap}}(U,V) = \tfrac{1}{2}\,\|F_{\mathrm{lyap}}(U,V)\|_F^2 \qquad (f_{\mathrm{lyap}})$$

and

$$F_{\mathrm{lyap}}(U,V) = A \cdot (UV^{\mathrm H}) + (UV^{\mathrm H}) \cdot A^{\mathrm H} + Q, \qquad (F_{\mathrm{lyap}})$$

respectively.

Remark: Please note that this example serves mainly as an illustration, and that computing a good low-rank solution to a Lyapunov equation proves to be quite difficult in practice, due to an increasingly large amount of local minima as k increases.

7.1 Complex derivatives

7.1.1 Pen & paper differentiation

Scalar functions. To solve optimization problems of the form (minf), many algorithms require first-order derivatives of the real-valued objective function f. For a function of real variables $f_{\mathbb R}: \mathbb{R}^n \to \mathbb{R}$, these derivatives can be captured in the gradient $\frac{\partial f_{\mathbb R}}{\partial x}$. For example, let $f(x) = \sin(x^{\mathrm T}x + 2x^{\mathrm T}a)$ for $a, x \in \mathbb{R}^n$; then its gradient is given by

$$\frac{\partial f}{\partial x} = \cos(x^{\mathrm T}x + 2x^{\mathrm T}a) \cdot (2x + 2a).$$

Things get more interesting for real-valued functions of complex variables. Let

$$f(z) = \sin(\overline{z}^{\mathrm T}z + \overline{z}^{\mathrm T}a + \overline{a}^{\mathrm T}z),$$

where $z \in \mathbb{C}^n$ and an overline denotes the complex conjugate of its argument.
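A small numeric check that this f is indeed real-valued for complex z; the test values below are arbitrary:

% f is real-valued: z'*z and z'*a+a'*z are real whenever a is real.
a = randn(5,1);
f = @(z)sin(z'*z+z'*a+a'*z);
z = randn(5,1)+1i*randn(5,1);
imag(f(z)) % zero, up to round-off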
SDF algorithms. The first is simply evaluating the structure given the variable z:

[x,state] = struct_mystruct(z);

From here on, we will assume the transformation maps z to x = sqrt(z). Thus, an early implementation could look like:

function [x,state] = struct_mystruct(z,task)
if nargin < 2 || isempty(task)
    x = sqrt(z);
    state = [];
end
end

Right Jacobian-vector product. Once the function has been evaluated at a certain z, it will be called several times to evaluate the Jacobian-vector product

$$\frac{\partial\,\mathrm{vec}(x)}{\partial\,\mathrm{vec}(z)^{\mathrm T}} \cdot r$$

at z, given a vector r stored in task.r. The latter will not actually be stored as a vector, but rather in the same format as the input variable z, be it an array or nested cell array of arrays. Likewise, the result of the right Jacobian-vector product x should be in the same format as the output of the evaluation struct_mystruct(z). Moreover, the second output state from the function evaluation stage can be used to store computations in, which will then be made available as fields in task. Taking this into account, our transformation could look like:

function [x,state] = struct_mystruct(z,task)
if nargin < 2 || isempty(task)
    x = sqrt(z);
    state.dz = 1./(2*x);
elseif ~isempty(task.r)
    % Here, task.dz is equal to 1./(2*sqrt(z)), which we stored earlier.
    x = task.dz.*task.r;
    state = [];
end
end

Left Jacobian-vector product. Finally, the function will also be called to evaluate the left Jacobian-vector product
Science, KU Leuven, Celestijnenlaan 200A, BE-3001 Leuven, Belgium (Laurent.Sorber@cs.kuleuven.be, Marc.VanBarel@cs.kuleuven.be).
† Group Science, Engineering and Technology, KU Leuven Kulak, E. Sabbelaan 53, BE-8500 Kortrijk, Belgium (Lieven.DeLathauwer@kuleuven-kulak.be).
‡ STADIUS, Department of Electrical Engineering (ESAT), KU Leuven, Kasteelpark Arenberg 10, BE-3001 Leuven, Belgium (Lieven.DeLathauwer@esat.kuleuven.be).
§ iMinds Future Health Department, Kasteelpark Arenberg 10, BE-3001 Leuven, Belgium.

1 Getting started

What is Tensorlab? In short, Tensorlab is a MATLAB toolbox for rapid prototyping of coupled tensor decompositions with structured factors. In Tensorlab, data sets are stored as (possibly incomplete) vectors, matrices and higher-order tensors. By mixing different types of decompositions and factor structures, a vast amount of factorizations can be computed. Users can define their own factor structures, or choose from the library of preimplemented structures, including nonnegativity, orthogonality, Toeplitz and Vandermonde matrices, to name a few.

Section 2 covers how dense, incomplete and sparse data sets are represented as tensors in Tensorlab, and the basic operations on such tensors. Tensor decompositions such as the canonical polyadic decomposition (CPD), low multilinear rank approximation (LMLRA) and block term decomposition (BTD) are discussed in Sections 3, 4 and 5, respectively.
The relative error between the tensor and its BTD can then be computed as:

relerr = frob(btdres(T,Uhat))/frob(T);

6 Structured data fusion

[Figure 6.1: Schematic of structured data fusion. The vector z1, upper triangular matrix z2 and full matrix z3 are transformed into a Toeplitz, orthogonal and nonnegative matrix, respectively. The resulting factors are then used to jointly factorize two coupled data sets.]

Structured data fusion (SDF) [14] is the practice of jointly factorizing one or more coupled data sets while optionally imposing structure on the factors. Each data set, stored as a dense, sparse or incomplete tensor (cf. Section 2), in a data fusion problem can be factorized with a different tensor decomposition. Currently, the user has the choice of the CPD and BTD models, and with a bit of effort it is easy to add new models as well. Structure can be imposed on the factors in a modular way, and the user can choose from a library of predefined structures such as nonnegativity, orthogonality, Hankel, Toeplitz, Vandermonde, matrix inverse and many more. See Contents.m for a complete list. By selecting the right structures, you can even compute classical matrix decompositions such as the QR factorization, eigenvalue decomposition and singular value decomposition.

6.1 Domain specific language
The following example applies rankest to the amino acids dataset [1]:

url = 'http://www.models.life.ku.dk/go-engine?filename=amino.mat';
urlwrite(url,'amino.mat'); % Download amino.mat in this directory.
load amino X;
rankest(X);

The resulting figure is shown in Figure 3.2. The algorithm computes the CPD of the given tensor for various values of R, starting at the smallest integer for which the lower bound on the relative error (displayed as a solid blue line) is smaller than the specified options.MinRelErr. The lower bound is based on the truncation error of the tensor's multilinear singular values [5]. For incomplete and sparse tensors this lower bound is not available, and the first value to be tried for R is 1. The number of rank-one terms is increased until the relative error of the approximation is less than options.MinRelErr. In a sense, the corner of the resulting L-curve makes an optimal trade-off between accuracy and compression. The rankest tool computes the number of rank-one terms R corresponding to the L-curve corner and marks it on the plot with a square. This optimal number of rank-one terms is also rankest's first output. By capturing it as R = rankest(X), the L-curve is no longer plotted.

[Figure 3.2 plot: relative error versus number of rank-one terms, with curves "CPD error" and "Lower bound on error", the L-curve corner marked with a square, and the MinRelErr level indicated.]
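For example, to obtain the estimate without plotting and with a custom tolerance, assuming rankest accepts an options structure in the same way as the other tools:

% Capture the L-curve corner directly; no L-curve is plotted.
options.MinRelErr = 1e-4; % stop once the relative error drops below this
R = rankest(X,options);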
and the latter analytic case.

7.1.2 Numerical differentiation

Real scalar functions with the i-trick. The real gradient can be numerically approximated with deriv using the so-called i-trick [16]. For example, define the scalar functions

$$f_1(x) = \frac{10^{-20}}{1 - 10^{3}x}, \qquad f_2(x) = \sum_i \sin^3(x + a)_i, \qquad f_3(X,Y) = \arctan(\mathrm{trace}(X^{\mathrm T}Y)),$$

where $x \in \mathbb{R}$ for f1, $a, x \in \mathbb{R}^n$ for f2, and $X, Y \in \mathbb{R}^{n \times n}$ for f3. Their first-order derivatives are

$$\frac{df_1}{dx} = \frac{10^{-17}}{(1 - 10^{3}x)^2}, \qquad \frac{\partial f_2}{\partial x} = 3\sin^2(x+a) * \cos(x+a), \qquad \frac{\partial f_3}{\partial X} = \frac{Y}{1 + \mathrm{trace}(X^{\mathrm T}Y)^2}, \qquad \frac{\partial f_3}{\partial Y} = \frac{X}{1 + \mathrm{trace}(X^{\mathrm T}Y)^2},$$

where * denotes the element-wise product. An advantage of using the i-trick is that it can compute first-order derivatives accurately, up to the order of the machine precision. The disadvantages are that it requires an equivalent of about 4 real function evaluations per variable (compared to 2 for finite differences), and that certain requirements must be met. First, only the real gradient can be computed, meaning the gradient can only be computed where the variables are real. Second, the function must be real-valued when evaluated in real variables. Third, the function must be analytic on the complex plane. In other words, the function may not be a function of the complex conjugate of its argument. For example, the i-trick can be used to compute the gradient of the function x -> x.'*x, but not of the function x -> x'*x, because the latter depends on conj(x). As a last example, note that x -> real(x) is not analytic in x ∈ C, because it can be written as x -> (x + conj(x))/2.
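To see the third requirement fail in practice, consider x -> x*conj(x): assuming deriv applies the i-trick for this real input, as described above, it returns 0 instead of the true gradient 2x, whereas finite differences do not:

% The i-trick silently fails on the nonanalytic function x*conj(x).
f = @(x)x.*conj(x);
x = randn;
deriv(f,x)                % i-trick: returns 0, since imag((x+ih)*(x-ih)) == 0
deriv(f,x,[],'gradient')  % finite differences: approximately 2*x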
computed with polyder(p), and its zeros can be computed with roots(p). The stationary points of p(x), i.e., all x which satisfy (dp/dx)(x) = 0, are the output of roots(polyder(p)). However, some of these solutions may have a small imaginary part corresponding to a numerical zero. The stationary points can also be computed as roots of the polynomial's derivative with polymin(p), which deals with solutions with small imaginary part and returns only real solutions. Analogously, the stationary points of a rational function p(x)/q(x) are given by roots(conv(polyder(p),q)-conv(p,polyder(q))), where conv(p,q) is the convolution of the two row vectors p and q and is equivalent to computing the coefficients of the polynomial p(x)*q(x). As in the polynomial case, there are a few numerical issues which can be dealt with by computing the stationary points of p(x)/q(x) with ratmin(p,q).

Bivariate polynomials and rational functions. In Tensorlab, a bivariate polynomial p(x,y) is represented by a matrix p as

$$p(x,y) = \begin{bmatrix} 1 & y & \cdots & y^{d_y} \end{bmatrix} \begin{bmatrix} a_{00} & a_{01} & \cdots & a_{0d_x} \\ a_{10} & a_{11} & & \\ \vdots & & \ddots & \\ a_{d_y0} & & & a_{d_yd_x} \end{bmatrix} \begin{bmatrix} 1 \\ x \\ \vdots \\ x^{d_x} \end{bmatrix}.$$

For example, the six-hump camel back function [6]

$$p(x,y) = 4x^2 - 2.1x^4 + \tfrac{1}{3}x^6 + xy - 4y^2 + 4y^4$$

is represented by the matrix

p = [0 0 4 0 -2.1 0 1/3; 0 1 0 0 0 0 0; -4 0 0 0 0 0 0; 0 0 0 0 0 0 0; 4 0 0 0 0 0 0];

and can be seen in Figure 8.1. The stationary points of the polynomial p(x,y) can be computed as the solutions of the system ∂p/∂x = ∂p/∂y = 0 with polymin2(p).
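A small worked example of the univariate case (the quartic below is a hypothetical test polynomial):

% Stationary points of p(x) = x^4 - 2*x^2, i.e., roots of 4*x^3 - 4*x.
p = [1 0 -2 0 0];       % coefficients in MATLAB's descending-power order
xs = roots(polyder(p))  % -1, 0 and 1, possibly with small imaginary parts
xr = polymin(p)         % real stationary points only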
gradient. If an expression for the scaled conjugate cogradient is available, it can be supplied to the optimization algorithm in the second argument. For the Lyapunov equation, the implementation could look like:

function z = lyap_minf_grad(A,Q,z0)
AHA = A'*A;
z = minf_lbfgs(@f,@grad,z0);
function f = f(z)
    U = z{1}; V = z{2};
    F = A*(U*V')+(U*V')*A'+Q;
    f = 0.5*(F(:)'*F(:));
end
function g = grad(z)
    U = z{1}; V = z{2};
    gU = AHA*U*(V'*V)+A'*U*(V'*A'*V)+A'*Q*V+ ...
         A*U*(V'*A*V)+U*(V'*AHA*V)+Q*A*V;
    gV = V*(U'*AHA*U)+A*V*(U'*A*U)+Q'*A*U+ ...
         A'*V*(U'*A'*U)+AHA*V*(U'*U)+A'*Q'*U;
    g = {gU,gV};
end
end

The function grad(z) computes the scaled conjugate cogradient 2*∂f/∂conj(z), which coincides with the real gradient for z ∈ R. See Section 7.1 for more information on complex derivatives.

Remark: Note that the gradient grad(z) is returned in the same format as the solution z, i.e., as a cell array containing matrices of the same size as U and V. However, the gradient may also be returned as a vector if this is more convenient for the user. In that case, the scaled conjugate cogradient should be of the format 2*∂f/∂conj(z), where z = [vec(U); vec(V)].

Setting the options. The parameters of the selected optimization algorithm can be set by supplying the method with an options structure, e.g.:

% Set algorithm tolerances.
options.TolFun = 1e-6;
options.TolX
s mode-n vectors are premultiplied by a given matrix. In other words, U*tens2mat(T,n) is a mode-n matricization of the mode-n tensor-matrix product $\mathcal{T} \cdot_n U$. The function tmprod(T,U,mode) computes the tensor-matrix product $\mathcal{T} \cdot_{\mathrm{mode}(1)} U\{1\} \cdot_{\mathrm{mode}(2)} U\{2\} \cdots$. For example:

T = randn(3,5,7);
U = {randn(11,3),randn(13,5),randn(15,7)};
S = tmprod(T,U,1:3);
size(S)

outputs [11 13 15]. To compute a single tensor-matrix product, it is not necessary to use a cell array. The following is an example of a single mode-2 tensor-matrix product:

T = randn(3,5,7);
S = tmprod(T,randn(13,5),2);

Kronecker and Khatri-Rao product. Tensorlab includes a fast implementation of the Kronecker product A ⊗ B with the function kron(A,B), which overrides MATLAB's built-in implementation. Let A and B be matrices of size I by J and K by L, respectively; then the Kronecker product of A and B is the IK by JL matrix

$$A \otimes B = \begin{bmatrix} a_{11}B & \cdots & a_{1J}B \\ \vdots & \ddots & \vdots \\ a_{I1}B & \cdots & a_{IJ}B \end{bmatrix}.$$

The Khatri-Rao product A ⊙ B can be computed by kr(A,B). Let A and B both be matrices with N columns; then the Khatri-Rao product of A and B is the column-wise Kronecker product

$$A \odot B = \begin{bmatrix} a_1 \otimes b_1 & \cdots & a_N \otimes b_N \end{bmatrix}.$$

More generally, kr(A,B,C,...) and kr(U) compute the string of Khatri-Rao products A ⊙ B ⊙ C ⊙ ... and U{1} ⊙ U{2} ⊙ U{3} ⊙ ..., respectively.

Visualization. Tensorlab offers three methods to visualize dense third-order tensors: slice3, surf3 and voxel3.
to each other. Subfactors may also be placed vertically, by using a semicolon instead of a comma to separate them. The SDF algorithm will first build the subfactors, which are in this case simply references to the variables A1 and A2, and then call cell2mat on the cell array of subfactors to create the factor A. Like factors, structure may also be imposed on subfactors. For instance, we could have written

model.factors.A = {{'A1',@struct_sigmoid},{'A2'}};

to constrain the elements of the first submatrix in the factor A to the interval [-1,1] (the elements of the variable A1 itself are not restricted to this interval, however). Because computing the rank-(Lr,Lr,1) BTD is a relatively common task, we show a final, compact and efficient way to do so using the predefined structure struct_LL1:

% Define rank-(Lr,Lr,1) BTD model using the CPD (more efficient).
L = [L1 L2];
LL1 = @(z,task)struct_LL1(z,task,L);
model.variables.A = randn(size_tens(1),L1+L2);
model.variables.B = randn(size_tens(2),L1+L2);
model.variables.c = randn(size_tens(3),2);
model.factors.A = 'A';
model.factors.B = 'B';
model.factors.C = {'c',LL1};
model.factorizations.mybtd.data = T;
model.factorizations.mybtd.cpd = {'A','B','C'};

Example 5: constants. When a factor or subfactor is intended to be constant, then instead of referring to a variable by its name (or index), use the constant array itself.

Example 5a: known factor matrix. The following example computes a CPD in which one factor matrix is known:
[14] L. Sorber, M. Van Barel, and L. De Lathauwer. Structured data fusion. ESAT-STADIUS Internal Report 13-177, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium, 2013.
[15] L. Sorber, M. Van Barel, and L. De Lathauwer. Numerical solution of bivariate and polyanalytic polynomial systems. SIAM Journal on Numerical Analysis, 2014. Accepted.
[16] W. Squire and G. Trapp. Using complex variables to estimate derivatives of real functions. SIAM Review, 10(1):110-112, Mar. 1998.
[17] L. R. Tucker. Some mathematical notes on three-mode factor analysis. Psychometrika, 31(3):279-311, 1966.
% Generate a CPD.
I = 15; R = 4;
U = cpd_rnd([I I I],R);
T = cpdgen(U);

% Model the CPD of T where the third factor is known.
model.variables.a = randn(size(U{1}));
model.variables.b = randn(size(U{2}));
model.factors.A = 'a';
model.factors.B = 'b';
model.factors.C = U{3}; % The third factor is constant.
model.factorizations.myfac.data = T;
model.factorizations.myfac.cpd = {'A','B','C'};

% Solve the SDF model.
options.Display = 5; % View convergence progress every 5 iterations.
sol = sdf_nls(model,options);
sol.variables
sol.factors

Example 5b: partially known factor matrix. To create a factor of which some of the columns are known, simply define the factor to consist of a number of subfactors, where one of the subfactors is a numeric array. The following example computes a CPD in which part of one factor matrix is known:

% Generate a CPD.
I = 15; R = 4;
U = cpd_rnd([I I I],R);
T = cpdgen(U);

% Model the CPD of T where the second half of U{3} is known.
model.variables.a = randn(size(U{1}));
model.variables.b = randn(size(U{2}));
model.variables.c = randn(size(U{1})./[1 2]);
model.factors.A = 'a';
model.factors.B = 'b';
model.factors.C = {'c',U{3}(:,R/2+1:end)}; % The third factor is partially known.
model.factorizations.myfac.data = T;
Generating pseudorandom factor matrices and core tensors. A cell array of block terms U = {U{1},U{2},...}, in which the r-th term has a core tensor U{r}{end} of size size_core{r}, corresponding to a BTD of a tensor of dimensions size_tens, can be generated with:

size_tens = [17 19 21];
size_core = {[3 5 7],[6 3 5],[4 3 4]};
U = btd_rnd(size_tens,size_core);

By default, btd_rnd generates U{r}{n} using randn, after which the factor matrices U{r}{n}, 1 <= n <= N, are orthogonalized. Other generators can also be specified with an options structure, e.g.:

options.Real = @rand;
options.Imag = @rand;
U = btd_rnd(size_tens,size_core,options);

and its inline equivalent:

U = btd_rnd(size_tens,size_core,struct('Real',@rand,'Imag',@rand));

Generating the associated full tensor. Given a cell array of block terms U = {U{1},U{2},...}, its associated full tensor T can be computed with:

T = btdgen(U);

This is equivalent to:

R = length(U);
N = length(U{1})-1;
size_tens = cellfun(@(u)size(u,1),U{1}(1:N));
T = zeros(size_tens);
for r = 1:R
    T = T+tmprod(U{r}{N+1},U{r}(1:N),1:N);
end

5.2 Computing a BTD

With a specific algorithm. Currently, the user must generate his or her own initialization and select a specific family of algorithms, such as btd_nls or btd_minf, to compute the BTD given the initialization. To compute a BTD of a dense, sparse or incomplete tensor T, given an initialization U0, run btd_nls(T,U0).
$$J(X,Y) = \begin{bmatrix} 0 & \dfrac{\mathrm{vec}(\overline{X})^{\mathrm T}}{\mathrm{trace}(X^{\mathrm H}Y)} \\ \mathbf{1}^{\mathrm T} & 0 \\ 0 & \mathbf{1}^{\mathrm T} \end{bmatrix} \quad\text{and}\quad J_c(X,Y) = \begin{bmatrix} \dfrac{\mathrm{vec}(Y)^{\mathrm T}}{\mathrm{trace}(X^{\mathrm H}Y)} & 0 \\ \mathbf{1}^{\mathrm T} & 0 \\ 0 & -\mathbf{1}^{\mathrm T} \end{bmatrix},$$

respectively. For a nonanalytic tensor-valued function F(z,conj(z)), deriv(F,z,[],'Jacobian-C') computes a finite differences approximation of the complex Jacobian

$$\left[\frac{\partial\,\mathrm{vec}(F)}{\partial\,\mathrm{vec}(z)^{\mathrm T}} \quad \frac{\partial\,\mathrm{vec}(F)}{\partial\,\mathrm{vec}(\overline{z})^{\mathrm T}}\right],$$

comprising both the Jacobian and conjugate Jacobian. The complex Jacobian of F2(X,Y) is the matrix [J(X,Y) Jc(X,Y)]. In the following example, the relative error of the complex Jacobian's finite differences approximation is computed:

% Approximate the complex Jacobian of a nonanalytic tensor-valued function
% with finite differences and compute its relative error.
F2 = @(z)[log(trace(z{1}'*z{2})); sum(sum(z{1}+conj(z{1}))); sum(sum(z{2}-conj(z{2})))];
J2 = @(z)[zeros(1,numel(z{1})) 1/trace(z{1}'*z{2})*reshape(conj(z{1}),1,[]) ...
          1/trace(z{1}'*z{2})*reshape(z{2},1,[]) zeros(1,numel(z{2})); ...
          ones(1,numel(z{1})) zeros(1,numel(z{2})) ...
          ones(1,numel(z{1})) zeros(1,numel(z{2})); ...
          zeros(1,numel(z{1})) ones(1,numel(z{2})) ...
          zeros(1,numel(z{1})) -ones(1,numel(z{2}))];
z = {randn(10)+randn(10)*1i,randn(10)+randn(10)*1i};
relerr2 = frob(J2(z)-deriv(F2,z,[],'Jacobian-C'))/frob(J2(z));

7.2 Nonlinear least squares

Algorithms. Tensorlab offers three algorithms for unconstrained nonlinear least squares problems: nls_gncgs, nls_gndl and nls_lm.
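For instance, a minimal sketch of a call to one of these solvers on a small analytic residual, assuming (as in the options examples later in this section) that the second argument may be a structure whose dz field returns the Jacobian:

% Solve the analytic NLS problem F(z) = z.^2 - c = 0 elementwise.
c = randn(5,1);
F = @(z)z.^2-c;
dF.dz = @(z)diag(2*z); % Jacobian of F; F is analytic, so no conjugate Jacobian
z = nls_gndl(F,dF,randn(5,1));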
Structured data fusion (SDF), with which multiple data sets can be jointly factorized while imposing structure on the factors, is covered in Section 6. Many of the algorithms to accomplish these tasks are based on complex optimization, that is, optimization of functions in complex variables. Section 7 introduces the necessary concepts and shows how to solve different types of complex optimization problems. Finally, Section 8 treats global optimization of bivariate and polyanalytic polynomials and rational functions, which appear as subproblems in tensor optimization.

Installation. Unzip Tensorlab to any directory, browse to that location in MATLAB and run:

addpath(pwd); % Add the current directory to the MATLAB search path.
savepath;     % Save the search path for future sessions.

Requirements. Tensorlab requires MATLAB 7.9 (R2009b) or higher because of its dependency on the tilde operator. If necessary, older versions of MATLAB can use Tensorlab by replacing the tilde in expressions such as [~,...] with tmp. To do so on Linux/OS X, browse to Tensorlab and run

sed -i 's/~,/tmp,/g;s/,~/,tmp/g' *.m

in your system's terminal. However, most of the functionality in Tensorlab requires at the very minimum MATLAB 7.4 (R2007a), because of its extensive use of bsxfun. Octave is only partially supported, mainly because it coerces nested functions into subfunctions. The latter do not share the workspace of their parent function, which is a feature used by Tensorlab in
Tensorlab User Guide

Laurent Sorber*‡, Marc Van Barel*, Lieven De Lathauwer†‡§

Contents

1 Getting started
2 Data sets: dense, incomplete and sparse tensors
  2.1 Representation
  2.2 Tensor operations
3 Canonical polyadic decomposition
  3.1 Problem and tensor generation
  3.2 Computing the CPD
  3.3 Choosing the number of rank-one terms R
4 Low multilinear rank approximation
  4.1 Problem and tensor generation
  4.2 Computing a LMLRA
  4.3 Choosing the size of the core tensor
5 Block term decomposition
  5.1 Problem and tensor generation
  5.2 Computing a BTD
6 Structured data fusion
  6.1 Domain specific language for SDF
  6.2 Implementing a new factor structure
7 Complex optimization
  7.1 Complex derivatives
  7.2 Nonlinear least squares
  7.3 Unconstrained nonlinear optimization
8 Global minimization of bivariate functions
  8.1 Stationary points of polynomials and rational functions
  8.2 Isolated solutions of a system of two bivariate polynomials

* NALAG, Department of Computer
where c1 and c2 are vectors, and size(A1,2) equals size(B1,2) equals L1, and analogously for L2.

Example 4a: modelling a rank-(L,L,1) BTD with a BTD. We could compute this decomposition by formulating it as a more general BTD

$$\mathcal{T} = (A_1 S_1 B_1^{\mathrm T}) \circ c_1 + (A_2 S_2 B_2^{\mathrm T}) \circ c_2 = \mathcal{S}_1 \cdot_1 A_1 \cdot_2 B_1 \cdot_3 c_1 + \mathcal{S}_2 \cdot_1 A_2 \cdot_2 B_2 \cdot_3 c_2,$$

where we have introduced the core tensors S1 and S2. First we create the tensor T with:

% Generate rank-(Lr,Lr,1) BTD.
size_tens = [10 10 10];
L1 = 2; L2 = 2;
U = btd_rnd(size_tens,{[L1 L1 1],[L2 L2 1]});
T = btdgen(U);

then we define the model as:

% Define rank-(Lr,Lr,1) BTD model using the BTD.
model.variables.A1 = randn(size_tens(1),L1);
model.variables.B1 = randn(size_tens(2),L1);
model.variables.c1 = randn(size_tens(3),1);
model.variables.S1 = randn(L1,L1,1);
model.variables.A2 = randn(size_tens(1),L2);
model.variables.B2 = randn(size_tens(2),L2);
model.variables.c2 = randn(size_tens(3),1);
model.variables.S2 = randn(L2,L2,1);
model.factors = {'A1','B1','c1','S1','A2','B2','c2','S2'};
model.factorizations.mybtd.data = T;
model.factorizations.mybtd.btd = {{1,2,3,4},{5,6,7,8}};

where, for the sake of brevity, we have used the index notation for the factors field. Notice that the BTD is specified in the same format as for other BTD algorithms, except that in place of factor matrices and core tensors there are references to factors. The BTD can then be computed by:

% Solve the SDF problem.
options.Display = 5;
% Define model factors as transformed variables.
model.factors.U = {'u',@struct_nonneg}; % Create U as struct_nonneg(u).

The factor U is built by taking the variable u and applying the transformation struct_nonneg. In fact, factors can be built with a much more complex structure using subfactors; see the following examples for details. Finally, we define the data set to be factorized and which factors to use:

% Define model factorizations.
model.factorizations.myfac.data = T;
model.factorizations.myfac.cpd = {'U','U','U'};

Each factorization in the SDF problem should be given a new name. In this case there is only one factorization, myfac, and it contains two fields. The first is data and contains the tensor to be factorized. The second should be either cpd or btd, depending on which model to use, and should define the factors to be used in the decomposition. Note that it is not necessary to use fields to describe the names of the variables and factors. Instead, one may also create cell arrays of variables and factors and use indices to refer to them. In this format, the model would be written as:

% Equivalent SDF model without using names for variables and factors.
model.variables = {randn(I,R)};
model.factors = {{1,@struct_nonneg}};
model.factorizations.myfac.data = T;
model.factorizations.myfac.cpd = {1,1,1};

The model can now be solved with one of the two families of algorithms for SDF problems: sdf_minf and sdf_nls.
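The nonlinear least squares variant can then be called as follows; a minimal sketch:

% Solve the SDF problem and inspect the result.
sol = sdf_nls(model);
sol.variables % optimized variables (here, the variable u)
sol.factors   % transformed factors (here, the nonnegative factor U)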
To obtain a global minimum, select the solution with smallest function value using:

[xy,v] = polymin2(p);
[vmin,idx] = min(v);
xymin = xy(idx,:);

To visualize the level-zero contour lines of ∂p/∂x and ∂p/∂y in the neighbourhood of the stationary points, set options.Plot = true, as follows:

p = randn(6); % Generate random bivariate polynomial of coordinate degree 5.
options.Plot = true;
xy = polymin2(p,options);

or, inline, as:

xy = polymin2(randn(6),struct('Plot',true));

[Figure 8.1: The six-hump camel back function and its stationary points (red dots).]

The corresponding system of polynomials is solved by polysol2. The latter includes several balancing steps to improve the accuracy of the solution. For some poorly scaled problems, the method may fail to find all solutions of the system. In that case, try decreasing polysol2's balancing tolerance options.TolBal to a smaller value, e.g.:

options.TolBal = 1e-4;
xy = polymin2(p,options);

Computing the stationary points of a bivariate rational function p(x,y)/q(x,y) is completely analogous to the polynomial case. The following example generates a random bivariate rational function and computes its stationary points:

p = randn(6); % Random bivariate polynomial of coordinate degree 5.
q = randn(4); % Random bivariate polynomial of coordinate degree 3.
options.Plot = true;
xy = ratmin2(p,q,options);

The inline equivalent is ratmin2(randn(6),randn(4),struct('Plot',true)).
model.factorizations.myfac.cpd = {'A','B','C'};

% Solve the SDF model.
options.Display = 5; % View convergence progress every 5 iterations.
sol = sdf_nls(model,options);
sol.variables
sol.factors

Example 6: chaining factor structures. Factor structures can be chained, so that a variable is transformed by a sequence of functions. In the following example we compute a CPD in which a subfactor of the first factor matrix is a nonnegative Toeplitz matrix, and the last factor is known. To create a nonnegative Toeplitz subfactor, we will create a generator vector for the Toeplitz matrix, transform it with struct_nonneg so that it is nonnegative, and then transform it with struct_toeplitz to create a nonnegative Toeplitz subfactor. The factor A is defined as a matrix consisting of two subfactors: the top subfactor transforms the variable atop with struct_nonneg followed by struct_toeplitz, and the bottom subfactor is simply abtm.

% Generate CPD wherein the first factor has a nonnegative Toeplitz
% subfactor, and the last factor is known.
I = 15; R = 4;
U = cpd_rnd([I I I],R);
U{1}(1:R,1:R) = toeplitz(rand(4,1));
T = cpdgen(U);

% Define model variables.
model.variables.atop = randn(2*R-1,1);
model.variables.abtm = randn(I-R,R);
model.variables.b = randn(I,R);

% Define model factors. The first factor vertically concatenates two
% subfactors. The top subfactor is generated as a nonnegative Toeplitz
% matrix, i.e., A
ced line search.
   cpd_els      - CPD exact line search.
   cpd_eps      - CPD exact plane search.
   cpd_lsb      - CPD line search by Bro.
Utilities
   cpderr       - Errors between factor matrices in a CPD.
   cpdgen       - Generate full tensor given a polyadic decomposition.
   cpdres       - Residual of a polyadic decomposition.
   rankest      - Estimate rank.

COMPLEX OPTIMIZATION
Nonlinear least squares
   nls_gncgs    - Nonlinear least squares by Gauss-Newton with CG-Steihaug.
   nls_gndl     - Nonlinear least squares by Gauss-Newton with dogleg trust region.
   nls_lm       - Nonlinear least squares by Levenberg-Marquardt.
   nlsb_gndl    - Bound-constrained NLS by projected Gauss-Newton dogleg trust region.
Unconstrained nonlinear optimization
   minf_lbfgs   - Minimize a function by L-BFGS with line search.
   minf_lbfgsdl - Minimize a function by L-BFGS with dogleg trust region.
   minf_ncg     - Minimize a function by nonlinear conjugate gradient.
   minf_sr1cgs  - Minimize a function by SR1 with CG-Steihaug.
Utilities
   deriv        - Approximate gradient and Jacobian.
   ls_mt        - Strong Wolfe line search by More-Thuente.
   mpcg         - Modified preconditioned conjugate gradients method.

LOW MULTILINEAR RANK APPROXIMATION
Algorithms
   lmlra        - Low multilinear rank approximation.
   lmlra_hooi   - LMLRA by higher-order orthogonal iteration.
   lmlra_minf   - LMLRA by unconstrained nonlinear optimization.
   lmlra_nls    - LMLRA by nonlinear least squares.
   lmlra3_dgn   - LMLRA by a differential-geometric Newton method.
   lmlra3_rtr   - LMLRA by a Riemannian trust region method.
   mlsvd        - (Truncated) multilinear singular value decomposition.
Initialization
   lmlra_aca    - LMLRA by adaptive cross-approximation.
   lmlra_rnd    - Pseudorandom initialization for LMLRA.
Utilities
   lmlraerr     - Errors between factor matrices in a LMLRA.
   lmlragen     - Generate full tensor given a core tensor and factor matrices.
   lmlrares     - Residual of a LMLRA.
   mlrank       - Multilinear rank.
   mlrankest    - Estimate multilinear rank.

STRUCTURED DATA FUSION
Algorithms
   sdf_minf     - Structured data fusion by unconstrained nonlinear optimization.
   sdf_nls      - Structured data fusion by nonlinear least squares.
U = cpd_rnd([I I I],R);
U{1} = orth(U{1}); % Enforce orthonormal columns.
T = cpdgen(U);

The help information provided by struct_orth tells us that the variable q should be a vector of length I*R-0.5*R*(R-1). To transform q into a matrix with orthonormal columns, we need to pass the size of Q to struct_orth. One way to do this is with an anonymous function, as shown below. Now we are ready to define and solve the SDF problem:

% Define model variables.
model.variables.q = randn(I*R-0.5*R*(R-1),1);
model.variables.b = randn(I,R);
model.variables.c = randn(I,R);

% Define a function which will transform q into Q.
% This anonymous function stores the size of Q as its last argument.
orthq = @(z,task)struct_orth(z,task,[I R]);

% Define model factors.
model.factors.Q = {'q',orthq}; % Create Q as orthq(q).
model.factors.B = 'b';
model.factors.C = 'c';

% Define model factorizations.
model.factorizations.myfac.data = T;
model.factorizations.myfac.cpd = {'Q','B','C'};

% Solve the SDF problem.
options.Display = 5; % View convergence progress every 5 iterations.
sol = sdf_nls(model,options);

% Check that Q indeed has orthonormal columns.
sol.factors.Q'*sol.factors.Q

Example 4: rank-(L,L,1) BTD. This example shows how to use the BTD model, and how to use subfactors to build more complex factors in SDF models. We want to compute a rank-(L,L,1) BTD [3, 4, 13] of a tensor T in two terms, i.e.,

$$\mathcal{T} = (A_1 B_1^{\mathrm T}) \circ c_1 + (A_2 B_2^{\mathrm T}) \circ c_2,$$
dient ∂f/∂conj(z): the algorithms require only the scaled conjugate cogradient g(z) = 2*∂f/∂conj(z), and can optimize f over both z ∈ C^n and z ∈ R^n.

Vector-valued functions. To solve optimization problems of the form (nls), first-order derivatives of the vector-valued (or, more generally, tensor-valued) residual function F are often required. For a tensor-valued function $F_{\mathbb R}: \mathbb{R}^n \to \mathbb{R}^{I_1 \times \cdots \times I_N}$, these derivatives can be captured in the Jacobian $\frac{\partial\,\mathrm{vec}(F_{\mathbb R})}{\partial x^{\mathrm T}}$. For example, let

$$F_{\mathbb R}(x) = \begin{bmatrix} \sin(x^{\mathrm T}x) \\ x^{\mathrm T}b + x^{\mathrm T}a \\ \cos(x^{\mathrm T}x) \end{bmatrix}$$

for $a, b, x \in \mathbb{R}^n$; then its Jacobian is given by

$$\frac{\partial\,\mathrm{vec}(F_{\mathbb R})}{\partial x^{\mathrm T}} = \begin{bmatrix} \cos(x^{\mathrm T}x) \cdot 2x^{\mathrm T} \\ b^{\mathrm T} + a^{\mathrm T} \\ -\sin(x^{\mathrm T}x) \cdot 2x^{\mathrm T} \end{bmatrix}.$$

But what happens when we allow $F: \mathbb{C}^n \to \mathbb{C}^{I_1 \times \cdots \times I_N}$? For example,

$$F(z) = \begin{bmatrix} \sin(\overline{z}^{\mathrm T}z) \\ z^{\mathrm T}b + \overline{z}^{\mathrm T}a \\ \cos(\overline{z}^{\mathrm T}z) \end{bmatrix},$$

where $z \in \mathbb{C}^n$, could be a generalization of $F_{\mathbb R}$ to the complex plane. Following a similar reasoning as for scalar functions f, we can define a Jacobian and a conjugate Jacobian as $\frac{\partial\,\mathrm{vec}(F)}{\partial z^{\mathrm T}}$ and $\frac{\partial\,\mathrm{vec}(F)}{\partial \overline{z}^{\mathrm T}}$, respectively. For the example above, we have

$$\frac{\partial\,\mathrm{vec}(F)}{\partial z^{\mathrm T}} = \begin{bmatrix} \cos(\overline{z}^{\mathrm T}z) \cdot \overline{z}^{\mathrm T} \\ b^{\mathrm T} \\ -\sin(\overline{z}^{\mathrm T}z) \cdot \overline{z}^{\mathrm T} \end{bmatrix} \quad\text{and}\quad \frac{\partial\,\mathrm{vec}(F)}{\partial \overline{z}^{\mathrm T}} = \begin{bmatrix} \cos(\overline{z}^{\mathrm T}z) \cdot z^{\mathrm T} \\ a^{\mathrm T} \\ -\sin(\overline{z}^{\mathrm T}z) \cdot z^{\mathrm T} \end{bmatrix}.$$

Because F maps to the complex numbers, it is no longer true that the conjugate Jacobian is the complex conjugate of the Jacobian. In general, algorithms that solve (nls) require both the Jacobian and conjugate Jacobian. In some cases only one of the two Jacobians is required, e.g., when F is analytic in z, which implies $\frac{\partial\,\mathrm{vec}(F)}{\partial \overline{z}^{\mathrm T}} = 0$. Tensorlab offers nonlinear least squares solvers for both the general nonanalytic case
certain algorithms.

Contents.m. If you have installed Tensorlab to the directory tensorlab, run doc tensorlab from the command line for an overview of the toolbox's functionality or, if that fails, try help(pwd). Both commands display the file Contents.m, shown below. Although this user guide covers the most important aspects of Tensorlab, Contents.m shows a short one-line description of all exported functions.

TENSORLAB
Version 2.02, 2014-05-07

BLOCK TERM DECOMPOSITION
Algorithms
   btd_minf   - BTD by unconstrained nonlinear optimization.
   btd_nls    - BTD by nonlinear least squares.
Initialization
   btd_rnd    - Pseudorandom initialization for BTD.
Utilities
   btdgen     - Generate full tensor given a BTD.
   btdres     - Residual of a BTD.

CANONICAL POLYADIC DECOMPOSITION
Algorithms
   cpd        - Canonical polyadic decomposition.
   cpd_als    - CPD by alternating least squares.
   cpd_minf   - CPD by unconstrained nonlinear optimization.
   cpd_nls    - CPD by nonlinear least squares.
   cpd3_sd    - CPD by simultaneous diagonalization.
   cpd3_sgsd  - CPD by simultaneous generalized Schur decomposition.
Initialization
   cpd_gevd   - CPD by a generalized eigenvalue decomposition.
   cpd_rnd    - Pseudorandom initialization for CPD.
Line and plane search
   cpd_aels   - CPD approximate enhan
The following example demonstrates these methods on the amino acids dataset [1]:

url = 'http://www.models.life.ku.dk/go-engine?filename=amino.mat';
urlwrite(url,'amino.mat'); % Download amino.mat in this directory.
load amino X;
figure(1); voxel3(X);
figure(2); surf3(X);
figure(3); slice3(X);

The resulting MATLAB figures can be seen in Figure 2.1. By default, the voxel3 plot uses a fast but not so accurate renderer. To enable the slow but accurate renderer, use voxel3(X,'fast',false). See the help information for more rendering options.

[Figure 2.1: Three functions for visualizing third-order tensors in Tensorlab: (a) voxel3, (b) surf3, (c) slice3.]

2.2.2 For dense, incomplete and sparse tensors

Frobenius norm. The Frobenius norm of a tensor is the square root of the sum of square magnitudes of its known elements. Given a tensor T, its Frobenius norm can be computed with frob(T). If the tensor is dense, this is equivalent to norm(T(:)), i.e., the two-norm of the vectorized tensor. The squared Frobenius norm can be computed with frob(T,'squared').

Matricized tensor Kronecker product. Often, algorithms for computing tensor decompositions do not explicitly need matricized tensors or Kronecker products, but rather the matricized tensor Kronecker product

tens2mat(T,n)*kron(U([end:-1:n+1 n-1:-1:1])).
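Contents.m lists mtkronprod for this product; a hedged numerical check of the identity above, assuming the signature mtkronprod(T,U,n):

% Check the matricized tensor Kronecker product for n = 2 and N = 3.
T = randn(3,5,7);
U = {randn(3,2),randn(5,2),randn(7,2)};
M1 = tens2mat(T,2)*kron(U{3},U{1}); % explicit, using the expression above
M2 = mtkronprod(T,U,2);
frob(M1-M2) % zero, up to round-off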
finite differences, and compute its relative error:

f = @(z)(ones(size(z))'*(z+conj(z)))*log(z'*z);
g = @(z)2*(ones(size(z))*log(z'*z)+(ones(size(z))'*(z+conj(z)))*z/(z'*z));
z = randn(10,1)+randn(10,1)*1i;
relerr = norm(g(z)-deriv(f,z))/norm(g(z));

Remark: In case of doubt, use deriv(f,z,[],'gradient') to compute the scaled conjugate cogradient. Running deriv(f,z) will attempt to use the i-trick when z is real, which can be substantially more accurate, but should only be applied when f is analytic.

Real vector-valued functions with the i-trick. Analogously to scalar functions, the real Jacobian of tensor-valued functions can also be numerically approximated with deriv using the i-trick. Take for example the following matrix-valued function

$$F(x) = \begin{bmatrix} \log(x^{\mathrm T}x) & 0 \\ 2a^{\mathrm T}x & \sin(x^{\mathrm T}x) \end{bmatrix},$$

where $a, x \in \mathbb{R}^n$, and its real Jacobian

$$\frac{\partial\,\mathrm{vec}(F)}{\partial x^{\mathrm T}} = \begin{bmatrix} \frac{2}{x^{\mathrm T}x}x^{\mathrm T} \\ 2a^{\mathrm T} \\ 0^{\mathrm T} \\ 2\cos(x^{\mathrm T}x)x^{\mathrm T} \end{bmatrix}.$$

Set a equal to ones(size(x)). Since each entry in F(x) is real-valued and not a function of conj(x), we can approximate the real Jacobian J(x) with the i-trick:

% Approximate the Jacobian of a tensor-valued function with the i-trick
% and compute its relative error.
F1 = @(x)[log(x'*x) 0; 2*ones(size(x))'*x sin(x'*x)];
J1 = @(x)[2/(x'*x)*x'; 2*ones(size(x))'; zeros(size(x))'; 2*cos(x'*x)*x'];
x = randn(10,1);
relerr1 = frob(J1(x)-deriv(F1,x))/frob(J1(x));

Analytic vector-valued functions with finite differences. The Jacobian of an analytic tensor-valued function F(z) can be approximated with finite differences by calling deriv(F,z,[],'Jacobian').
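For instance, a small analytic example (note the use of z.'*z rather than z'*z, so that F does not depend on conj(z)):

% Finite-difference Jacobian of an analytic tensor-valued function.
F = @(z)[sin(z.'*z); z.^2];
J = @(z)[cos(z.'*z)*2*z.'; diag(2*z)];
z = randn(4,1)+randn(4,1)*1i;
relerr = frob(J(z)-deriv(F,z,[],'Jacobian'))/frob(J(z));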
for SDF. Tensorlab uses a domain specific language (DSL) for modelling structured data fusion problems. The three key ingredients of an SDF model are: (1) defining variables, (2) defining factors as transformed variables, and (3) defining the data sets and which factors to use in their factorizations (cf. Figure 6.1).

Example 1: nonnegative symmetric CPD. The DSL is best explained by example. To start, we'll compute the nonnegative symmetric CPD of an incomplete tensor T. First, generate the tensor:

% Generate a nonnegative symmetric CPD.
I = 15; R = 4;
U = rand(I,R);
T = cpdgen({U,U,U});

% Remove 10% of its entries.
T(randperm(numel(T),round(0.1*numel(T)))) = NaN;

% Format as incomplete tensor.
T = fmt(T);

Next, create a structure model which defines the variables of the SDF problem. In this case there is only one variable, and its size is equal to that of the factor U. The variables field defines the parameters which are optimized, and is also used as initialization for the SDF algorithm:

% Define model variables.
model.variables.u = randn(I,R);

Here, the variable u is defined as a MATLAB array. It is also perfectly valid to define variables as (nested) cell arrays of arrays, if desired. Now we need to define the factors. There is only one factor in this CPD, which we will define as a transformation of the variable u. More specifically, we will require the factor to be nonnegative.
If the tensor is dense, then the result is equivalent to cpdgen(Uhat)-T. The relative error between the tensor and its CPD approximation can then be computed as:

relerr = frob(cpdres(T,Uhat))/frob(T);

One can also consider the relative error between the factor matrices U{n} that generated T and their approximations Uhat{n} computed by cpd. Due to the permutation and scaling indeterminacies of the CPD, the columns of Uhat may need to be permuted and scaled to match those of U before comparing them to each other. The function cpderr takes care of these indeterminacies and then computes the relative error between the given two sets of factor matrices. That is, relerr = cpderr(U,Uhat) returns a vector in which the n-th entry is the relative error between U{n} and Uhat{n}. This method is also applicable when Uhat{n} is an under- or overfactoring of the solution, meaning Uhat{n} comprises fewer or more rank-one terms than the tensor's rank, respectively. In the underfactoring case, Uhat{n} is padded with zero columns, and in the overfactoring case only the columns in Uhat{n} that best match those in U{n} are kept. See the help information for more details.

3.3 Choosing the number of rank-one terms R

To help choose the number of rank-one terms R, use the rankest tool. Running rankest(T) on a dense, sparse or incomplete tensor T plots an L-curve which represents the balance between the relative error of the CPD and the number of rank-one terms R.
$$\left(\frac{\partial\,\mathrm{vec}(x)}{\partial\,\mathrm{vec}(z)^{\mathrm T}}\right)^{\mathrm H} l \;+\; \overline{\left(\frac{\partial\,\mathrm{vec}(x)}{\partial\,\mathrm{vec}(\overline{z})^{\mathrm T}}\right)^{\mathrm H} l},$$

where the partial derivative with respect to conj(z) treats z as constant (cf. Section 7.1), and the vector l is stored in task.l. The output x should be of the same format as the input z. Here, the second term in the left Jacobian-vector product is zero, and the full implementation of the structure becomes:

function [x,state] = struct_mystruct(z,task)
if nargin < 2 || isempty(task)
    x = sqrt(z);
    state.dz = 1./(2*x);
elseif ~isempty(task.r)
    x = task.dz.*task.r;
    state = [];
elseif ~isempty(task.l)
    x = conj(task.dz).*task.l;
    state = [];
end
end

Remark: Note that currently the sdf_nls family of algorithms does not accept nonanalytic structures, i.e., structures for which the second term in the left Jacobian-vector product is nonzero. The sdf_minf family does support nonanalytic structures, however. For example, sdf_nls currently does not support a structure of the form z -> conj(z), but does support the structure z -> z. Since sdf_minf does not use the right Jacobian-vector product, a partial workaround is to implement the structure z -> z for the right Jacobian-vector product, so that sdf_nls can still use this structure if z is real-valued. At the same time, the left Jacobian-vector product can be based on the original structure z -> conj(z), so that sdf_minf can be used for both real and complex z.

7 Complex optimization

Optimization problems. An integral part of Tensorlab consists of algorithms for complex optimization.
Scalar functions with finite differences. If the conditions for the i-trick are not satisfied, or if a scaled conjugate cogradient is required, an alternative is to use finite differences to approximate first-order derivatives. In both cases, the finite difference approximation can be computed using deriv(f,x,[],'gradient'). As a first example, we compute the relative error of the finite difference approximation of the real gradient of f1, f2 and f3:

% Approximate the real gradient with finite differences and compute its relative error.
x = randn;
relerr1 = abs(g1(x)-deriv(f1,x,[],'gradient'))/abs(g1(x));
x = randn(10,1);
relerr2 = norm(g2(x)-deriv(f2,x,[],'gradient'))/norm(g2(x));
XY = {randn(10),randn(10)};
relerr3 = cellfun(@(a,b)frob(a-b)/frob(a),g3(XY),deriv(f3,XY,[],'gradient'));

If f(z) is a real-valued scalar function of complex variables, deriv can compute its scaled conjugate cogradient g(z). For example, let

$$f(z) = a^{\mathrm T}(z + \overline{z}) \cdot \log(\overline{z}^{\mathrm T}z), \qquad g(z) := 2\frac{\partial f}{\partial \overline{z}} = 2\left(a \cdot \log(\overline{z}^{\mathrm T}z) + a^{\mathrm T}(z + \overline{z}) \cdot \frac{z}{\overline{z}^{\mathrm T}z}\right),$$

where $a \in \mathbb{R}^n$ and $z \in \mathbb{C}^n$. Since f is real-valued and z is complex, calling deriv(f,z) is equivalent to deriv(f,z,[],'gradient'), and uses finite differences to approximate the scaled conjugate cogradient. In the following example, a is chosen as ones(size(z)), and the relative error of the finite difference approximation of the scaled conjugate cogradient g(z) is computed.
Figure 3.2: The rankest tool for choosing the number of rank-one terms R in a CPD.

4 Low multilinear rank approximation

[Figure 4.1: A low multilinear rank approximation of a third-order tensor.]

A low multilinear rank approximation (LMLRA) [10, 17] approximates a tensor by a smaller core tensor and a set of factor matrices. Let T and S be tensors of dimensions $I_1 \times I_2 \times \cdots \times I_N$ and $J_1 \times J_2 \times \cdots \times J_N$, respectively, let U{n} be matrices of size $I_n \times J_n$ ($J_n \leq I_n$) for $1 \leq n \leq N$, and let $\cdot_n$ denote the mode-n tensor-matrix product (see Section 2.2.1); then

$$\mathcal{T} \approx \mathcal{S} \cdot_1 U^{(1)} \cdot_2 U^{(2)} \cdot_3 \cdots \cdot_N U^{(N)}$$

is a LMLRA of the tensor T in the core tensor S and factor matrices U{n}. A visual representation of this decomposition in the third-order case is shown in Figure 4.1.

4.1 Problem and tensor generation

Generating pseudorandom factor matrices and core tensor. A cell array of pseudorandom factor matrices U = {U{1},U{2},...} with orthonormal columns, and a core tensor S of dimensions size_core, corresponding to a LMLRA of a tensor of dimensions size_tens, can be generated with:

size_tens = [17 19 21];
size_core = [3 5 7];
[U,S] = lmlra_rnd(size_tens,size_core);

By default, lmlra_rnd generates U{n} and S using randn, after which U{n} is orthogonalized. Other generators can also be specified with an options structure, e.g.:

options.Real = @rand;
options.Imag = @rand;
[U,S] = lmlra_rnd(size_tens,size_core,options);
% Select NLS as the main algorithm.
options.Algorithm = @lmlra_nls;
% Set algorithm stop criteria.
options.AlgorithmOptions.TolFun = 1e-12;
options.AlgorithmOptions.TolX = 1e-12;
[Uhat,Shat] = lmlra(T,size_core,options);

The structures options.InitializationOptions and options.AlgorithmOptions will be passed as options structures to the algorithms corresponding to the initialization and algorithm steps, respectively.

Viewing the algorithm output. Each step may also output additional information specific to that step. For instance, most LMLRA algorithms, such as lmlra_hooi, will keep track of the number of iterations and the difference in subspace angle of every two successive iterates (sangle). To obtain this information, capture the third output:

[Uhat,Shat,output] = lmlra(T,size_core,options);

and inspect its fields, for example by plotting the objective function value:

semilogy(0:output.Algorithm.iterations,sqrt(2*output.Algorithm.fval));
xlabel('iteration');
ylabel('frob(lmlrares(T,U))');
grid on;

Computing the error. The residual between a tensor T and its LMLRA defined by Uhat and Shat can be computed with:

res = lmlrares(T,Uhat,Shat);

If the tensor is dense, then the result is equivalent to lmlragen(Uhat,Shat)-T. The relative error between the tensor and its LMLRA can then be computed as:

relerr = frob(lmlrares(T,Uhat,Shat))/frob(T);

If the factor matrices U{n} that generated T are known, the subspace angle between them and their approximations Uhat{n} can be computed.
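A minimal sketch of that comparison, using lmlraerr from Contents.m:

% Subspace angle between U{n} and its approximation Uhat{n}, per mode.
sangle = lmlraerr(U,Uhat);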
ivariate and real polyanalytic rational functions.
   cum3       - Third-order cumulant tensor.
   cum4       - Fourth-order cumulant tensor.
   scov       - Shifted covariance matrices.
   dotk       - Dot product in K-fold precision.
   fmt        - Format data set.
   frob       - Frobenius norm.
   ful        - Convert formatted data set to an array.
   kr         - Khatri-Rao product.
   kron       - Kronecker product.
   mat2tens   - Tensorize a matrix.
   mtkronprod - Matricized tensor Kronecker product.
   mtkrprod   - Matricized tensor Khatri-Rao product.
   noisy      - Generate a noisy version of a given array.
   sumk       - Summation in K-fold precision.
   tens2mat   - Matricize a tensor.
   tens2vec   - Vectorize a tensor.
   tmprod     - Mode-n tensor-matrix product.
   vec2tens   - Tensorize a vector.
   slice3     - Visualize a third-order tensor with slices.
   spy3       - Visualize a third-order tensor's sparsity pattern.
   surf3      - Visualize a third-order tensor with surfaces.
   voxel3     - Visualize a third-order tensor with voxels.

2 Data sets: dense, incomplete and sparse tensors

2.1 Representation

Dense tensors. Scalars, vectors and matrices are zero-, one- and two-dimensional tensors, respectively. Arrays with three or more dimensions are called higher-order tensors. In Tensorlab, data sets are represented as dense, sparse or incomplete tensors. A dense tensor is simply a MATLAB array, e.g., A = randn(10,10) or T = randn(5,5,5). All functions in Tensorlab accept dense tensors, and many (but not all) also accept incomplete and sparse tensors transparently.

Incomplete tensors. An incomplete tensor is a data set in which some or most of the entries are unknown. To create an incomplete
41. lay true Show progress on the command line options Initialization Gcpd rnd Select pseudorandom initialization options Algorithm cpd_als Select ALS as the main algorithm options AlgorithmOptions LineSearch cpd_els Add exact line search options AlgorithmOptions TolFun 1e 12 Set algorithm stop criteria options AlgorithmOptions TolX 1e 12 Uhat cpd T R options The structures options InitializationOptions options AlgorithmOptions and options RefinementOptions will be passed as options structures to the algorithms correspond ing to initialization algorithm and refinement steps respectively For example in the example above cpd will call cod_als as cpd als T options Initialization T R options AlgorithmOptions Viewing the algorithm output Each step may also output additional information specific to that step For instance most CPD algorithms such as cpd als will keep track of the number of iterations and objective function value To obtain this information capture the second output 11 3 CANONICAL POLYADIC DECOMPOSITION Uhat output cpd T R options and inspect its fields for example by plotting the objective function value semilogy 0 output Algorithm iterations sqrt 2 output Algorithm fval xlabel iteration ylabel frob cpdres T U grid on Computing the error The residual between a tensor T and its CPD approximation defined by Uhat can be computed with res cpdres T U
For example:

% Generate pseudorandom BTD U and associated full tensor T.
size_tens = [17 19 21];
size_core = {[2 2 2],[2 2 2],[2 2 2]};
U = btd_rnd(size_tens,size_core);
T = btdgen(U);

% Generate an initialization U0 and compute the BTD with nonlinear least squares.
U0 = noisy(U,20);
Uhat = btd_nls(T,U0);

generates a tensor T as the sum of three real rank-(2,2,2) tensors, and then computes its BTD.

Setting the options. The selected algorithm may be customizable by supplying the method with an options structure (see the relevant help for more information), e.g.:

options.Display = 5; % Show convergence progress every 5 iterations.
options.MaxIter = 200; % Set stop criteria.
options.TolFun = 1e-12;
options.TolX = 1e-12;
Uhat = btd_nls(T,U0,options);

Viewing the algorithm output. The selected method also returns output specific to the algorithm, such as the number of iterations and the algorithm's objective function value. To obtain this information, capture the second output:

[Uhat,output] = btd_nls(T,U0,options);

and inspect its fields, for example by plotting the objective function value:

semilogy(0:output.iterations,sqrt(2*output.fval));
43. lete tensor define a structure containing the tensor s known elements their positions the tensor s size and an incomplete flag as T val randn 3 1 T sub 1 2 3 456 7 8 9 TSize t 9 Ge T incomplete true Lu and then call T fmt T to convert the tensor to Tensorlab format This creates a 9 x 9 x 9 tensor T with three known elements at positions 1 2 3 4 5 6 and 7 8 9 Alternatively the user may supply the linear indices in T ind instead of the subindices T sub To convert linear indices to subindices or vice versa see the MATLAB functions ind2sub and sub2ind respectively Another way to create an incomplete tensor is to create a dense tensor in which the unknown values are NaN For example T randn 5 5 5 T 1 31 end NaN T fmt T creates a 5 x 5 x 5 tensor T with diagonal elements equal to NaN and formats it as an incomplete tensor If there are no entries equal to NaN then fmt T returns T itself To convert an incomplete tensor back to a MATLAB array use ful T Sparse tensors A sparse tensor is a data set in which most of the entries are zero To create a sparse tensor define a structure containing the tensor s nonzero elements in the same way as for incomplete tensors and set the sparse flag T val randn 3 1 T sub 1 2 3 456 7 8 9 T size 9 9 9 T sparse true 2 DATA SETS DENSE INCOMPLETE AND SPARSE TENSORS before formatting the tensor with T fmt T A
may be necessary to specify what type of polynomial p is. In that case, set options.Univariate to true if p is a real-valued polyanalytic polynomial, and false otherwise. The same option also applies to ratmin2.

8.2 Isolated solutions of a system of two bivariate polynomials

The functions polymin2 and ratmin2 depend on the lower-level function polysol2 to compute the isolated solutions of systems of bivariate polynomials (bisys) or systems of polyanalytic univariate polynomials (unisys). In the case (bisys), polysol2(p,q) computes the isolated real solutions of the system p(x,y) = q(x,y) = 0. A solution (x*,y*) is said to be isolated if there exists a neighbourhood of (x*,y*) in C² that contains no solution other than (x*,y*). Some systems may have solutions that are isolated in R² but not in C².

Remark: By default, polysol2(p,q) assumes p and q are polyanalytic univariate if at least one of p and q contains complex coefficients. If this is not the case, the user can specify the type of polynomial by setting options.Univariate to true if both polynomials are polyanalytic univariate, or false otherwise. The algorithm in polysol2 applies several balancing steps to the problem in order to improve the accuracy of the computed roots, before refining them with Newton-Raphson. If the system is poorly scaled, it may be necessary to decrease the balancing tolerance options.TolBal to a smaller value.
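For example, to force the bivariate interpretation and tighten the balancing tolerance at the same time:

% Treat p and q as bivariate and use a smaller balancing tolerance.
options.Univariate = false;
options.TolBal = 1e-4;
xy = polysol2(p,q,options);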
Setting the options. The parameters of the selected optimization algorithm can be set by supplying the method with an options structure, e.g.,

    % Set algorithm tolerances.
    options.TolFun = 1e-12;
    options.TolX = 1e-6;
    options.MaxIter = 100;
    dF.dz = @J;
    z = nls_gndl(@F,dF,z0,options);

Remark. It is important to note that, since the objective function is the square of a residual norm, the objective function tolerance options.TolFun can be set as small as $10^{-32}$ for a given machine epsilon of $10^{-16}$. See the help information on the selected algorithm for more details.

Viewing the algorithm output. The second output of the optimization algorithms returns additional information pertaining to the algorithm. For example, the algorithms keep track of the objective function value in output.fval and also record the circumstances under which the algorithm terminated in output.info. As an example, the norm of the residual function of each iterate can be plotted with

    % Plot each iterate's objective function value.
    dF.dz = @J;
    [z,output] = nls_gndl(@F,dF,z0);
    semilogy(0:output.iterations,sqrt(2*output.fval));

Since the objective function is $f = \frac{1}{2}\|F\|_F^2$, we plot sqrt(2*output.fval) to see the norm $\|F\|_F$. See the help information on the selected algorithm for more details.

7.3 Unconstrained nonlinear optimization
In Figure 8.2 the solutions of a relatively difficult bivariate system p(x,y) = q(x,y) = 0 are plotted. The polynomials p(x,y) and q(x,y) are both of total degree 20 and have coefficients of which the exponents in base 10 range between 1 and 7.

[Figure 8.2: A system of bivariate polynomials p(x,y) = q(x,y) = 0 of total degree 20.]

The following example generates a random bivariate system, computes its isolated solutions and plots the results:

    p = tril(randn(5));
    q = triu(randn(5));
    options.Plot = true;   % Note that plotting can take a while.
    xy = polysol2(p,q,options);

Acknowledgements

We would like to acknowledge Mariya Ishteva and Nick Vannieuwenhoven for their important contributions to Tensorlab. Additionally, some code in Tensorlab is based on a modified version of GenRTR (Generic Riemannian Trust Region Package) by Pierre-Antoine Absil, Christopher G. Baker and Kyle A. Gallivan. The Moré-Thuente line search implemented in ls_mt was translated and adapted for complex optimization from the original Fortran MINPACK-2 implementation by Brett M. Averick, Richard G. Carter and Jorge J. Moré. Laurent Sorber is supported by a doctoral fellowship of the Flanders agency for Innovation by Science and Technology (IWT). Marc Van Barel is supported by (1) the Research Council KU Leuven: (a) OT/10/038, (b) PF/10/002 Optimization in Engineering (OPTEC); (2) the Research Foundation
Structure in the Jacobian can often be exploited in its matrix-vector product, allowing for an efficient computation of an inexact step. Concretely, the user has the choice of supplying the matrix-vector products J*x and J'*y, or the single matrix-vector product (J'*J)*x. An implementation of an inexact nonlinear least squares solver using the former method can be

    function z = lyap_nls_Jx(A,Q,z0)
        dF.dzx = @Jx;
        z = nls_gndl(@F,dF,z0);

        function F = F(z)
            U = z{1}; V = z{2};
            F = A*U*V'+U*V'*A'+Q;
        end

        function b = Jx(z,x,transp)
            U = z{1}; V = z{2};
            switch transp
                case 'notransp' % b = J*x
                    Xu = reshape(x(1:numel(U)),size(U));
                    Xv = reshape(x(numel(U)+1:end),size(V));
                    b = A*Xu*V'+Xu*V'*A'+A*U*Xv'+U*Xv'*A';
                    b = b(:);
                case 'transp'   % b = J'*x
                    X = reshape(x,size(A));
                    Bu = A'*X*V+X*A*V;
                    Bv = (U'*A'*X+U'*X*A)';
                    b = [Bu(:); Bv(:)];
            end
        end
    end

where the Kronecker structure of the Jacobian J_lyap is exploited by reducing the computations to matrix-matrix products. Under suitable conditions on A and Q, this implementation can achieve a complexity of O(n^2 k), where n is the order of the solution X = U*V' and k is its rank.

In the case of a nonanalytic residual function F(z, z̄), computing an inexact step requires the matrix-vector products J*x, J'*y, J_C*x and J_C'*y, where $J = \frac{\partial \operatorname{vec}(F)}{\partial \operatorname{vec}(z)^T}$ and $J_C = \frac{\partial \operatorname{vec}(F)}{\partial \operatorname{vec}(\bar z)^T}$ are the residual function's Jacobian and conjugate Jacobian, respectively. For more information on how to implement these matrix-vector products, read the help information of the desired nls solver.
            tmpa = kron(conj(V'*A*V),A');
            tmpb = kron(conj(A),U'*A'*U);
            JHJ11 = kron(conj(V'*V),AHA)+kron(conj(V'*AHA*V),I)+tmpa+tmpa';
            JHJ22 = kron(I,U'*AHA*U)+kron(conj(AHA),U'*U)+tmpb+tmpb';
            JHJ12 = kron(conj(V'),AHA*U)+kron(conj(V'*A),A'*U)+ ...
                    kron(conj(V'*A'),A*U)+kron(conj(V'*AHA),U);
            JHJ = [JHJ11 JHJ12; JHJ12' JHJ22];
        end
    end

By default, the algorithm nls_gndl uses the Moore-Penrose pseudoinverse of either J or J'*J to compute a new descent direction. However, if it is known that these matrices always have full rank, a more efficient least squares inverse can be computed. To do so, use the option

    % Compute a more efficient least squares step instead of using the pseudoinverse.
    options.JHasFullRank = true;
    z = nls_gndl(@f,dF,z0,options);

The other nonlinear least squares algorithms, nls_gncgs and nls_lm, use a different approach for calculating the descent direction and do not have such an option.

With an inexact step. The most computationally intensive part of most nonlinear least squares problems is computing the next descent direction, which involves inverting either the Jacobian $J = \frac{\partial \operatorname{vec}(F)}{\partial \operatorname{vec}(z)^T}$, or its Gramian J'*J in the case of an analytic residual function. With iterative solvers such as preconditioned conjugate gradients (PCG) and LSQR, the descent direction can be approximated using only matrix-vector products. The resulting descent directions are said to be inexact. Many problems exhibit some structure in the Jacobian which can be exploited in its matrix-vector product.
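To make the inexact idea concrete, independent of the Tensorlab solvers' internals: for an analytic residual, the Gauss-Newton step p solves (J'*J)*p = -J'*vec(F). A minimal sketch with MATLAB's stock pcg, where JHJx (a handle applying the Gramian to a vector) and the gradient vector JHF are assumed to be given by the user; both names are hypothetical:

    % Approximate the descent direction using only Gramian-vector products.
    tol = 1e-6;    % looser tolerance -> cheaper, but more inexact step
    maxit = 25;
    p = pcg(JHJx,-JHF,tol,maxit);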
3 Canonical polyadic decomposition

[Figure 3.1: A canonical polyadic decomposition of a third-order tensor.]

The canonical polyadic decomposition (CPD) [2, 7, 10] approximates a tensor with a sum of R rank-one tensors. Let $\mathcal{A} \circ \mathcal{B}$ denote the outer product between an N-th order tensor $\mathcal{A}$ and an M-th order tensor $\mathcal{B}$; then $\mathcal{A} \circ \mathcal{B}$ is the (N+M)-th order tensor defined by $(\mathcal{A} \circ \mathcal{B})_{i_1 \cdots i_N j_1 \cdots j_M} = a_{i_1 \cdots i_N}\, b_{j_1 \cdots j_M}$. For example, let a, b and c be nonzero vectors in $\mathbb{R}^n$; then $a \circ b \equiv a b^T$ is a rank-one matrix and $a \circ b \circ c$ is defined to be a rank-one tensor. Let $\mathcal{T}$ be a tensor of dimensions $I_1 \times I_2 \times \cdots \times I_N$ and let $U^{(n)}$ be matrices of size $I_n \times R$; then

$$\mathcal{T} \approx \sum_{r=1}^{R} U^{(1)}(:,r) \circ U^{(2)}(:,r) \circ \cdots \circ U^{(N)}(:,r)$$

is a CPD of $\mathcal{T}$ in R rank-one terms. A visual representation of this decomposition in the third-order case is shown in Figure 3.1.

3.1 Problem and tensor generation

Generating pseudorandom factor matrices. A cell array of pseudorandom factor matrices U = {U{1}, U{2}, ...}, corresponding to a CPD of a tensor of dimensions size_tens in R rank-one terms, can be generated with

    size_tens = [7 8 9];
    R = 4;
    U = cpd_rnd(size_tens,R);

By default, cpd_rnd generates U{n} as randn(size_tens(n),R). Other generators can also be specified with an options structure, e.g.,

    options.Real = @rand;
    options.Imag = @rand;
    U = cpd_rnd(size_tens,R,options);
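To make the rank-one structure concrete, a small sketch comparing cpdgen with explicitly summed outer products, using the column-major identity vec(a o b o c) = kron(c,kron(b,a)); all functions used here appear elsewhere in this guide:

    % Verify that cpdgen matches a sum of R outer products.
    U = cpd_rnd([4 5 6],2);
    T = cpdgen(U);
    T2 = zeros(4,5,6);
    for r = 1:2
        T2 = T2 + reshape(kron(U{3}(:,r),kron(U{2}(:,r),U{1}(:,r))),[4 5 6]);
    end
    relerr = frob(T-T2)/frob(T);   % should be of the order of machine precision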
The SDF problem can then be solved with sdf_minf and sdf_nls. In the case of many missing entries, the sdf_minf family is likely to perform best. Their first output contains the optimized variables and factors in the fields variables and factors, respectively.

    % Solve the SDF problem.
    options.Display = 5;   % View convergence progress every 5 iterations.
    sol = sdf_nls(model,options);
    sol.variables
    sol.factors

Example 2: structured coupled matrix factorization. Let X and Y be square matrices of order N that we wish to jointly factorize as $U \Lambda_X V^T$ and $U \Lambda_Y W^T$, respectively. The factor U is common to both X and Y, and the extent to which the columns are shared is given by the absolute values of the diagonal matrices $\Lambda_X$ and $\Lambda_Y$. Furthermore, we will require the elements of W to lie in the interval [-3, 5] and impose a Vandermonde structure on V, so that

$$V = \begin{bmatrix} v_1^0 & v_1^1 & \cdots & v_1^d \\ \vdots & \vdots & & \vdots \\ v_N^0 & v_N^1 & \cdots & v_N^d \end{bmatrix}$$

for some positive integer d. Notice that V depends only on a so-called generator vector $v = [v_1\ \cdots\ v_N]^T$. First we'll create the matrices X and Y:

    % Generate structured coupled matrices X and Y.
    N = 10; R = 4;
    U = randn(N,R);
    V = bsxfun(@power,randn(N,1),0:R-1);   % Vandermonde factor.
    W = rand(N,R)*8-3;                     % Elements in [-3,5].
    lambdaX = [0 randn(1,3)];              % X does not share first column of U.
    lambdaY = [randn(1,3) 0];              % Y does not share last column of U.
    X = U*diag(lambdaX)*V';
    Y = U*diag(lambdaY)*W';

Then we define a structure containing the model's variables:
    % Define model variables.
    model.variables.u = randn(N,R);
    model.variables.v = randn(N,1);
    model.variables.w = randn(N,R);
    model.variables.lambdaX = randn(1,R);
    model.variables.lambdaY = randn(1,R);

Here, the variables are the matrices U and W, the generator vector v for the Vandermonde matrix V, and the diagonals $\Lambda_X$ and $\Lambda_Y$. The model assumes that X and Y share at most R = 4 vectors in U. Next, we define the factors as transformed variables with

    % Define the structure for V by creating an anonymous function which stores deg.
    deg = [0 R-1];
    vanderm = @(z,task)struct_vander(z,task,deg);
    % Define the structure for W by creating an anonymous function which stores rng.
    rng = [-3 5];
    sigmoid = @(z,task)struct_sigmoid(z,task,rng);
    % Define model factors as transformed variables.
    model.factors.U = 'u';
    model.factors.V = {'v',vanderm};   % Create V as vanderm(v).
    model.factors.W = {'w',sigmoid};   % Create W as sigmoid(w).
    model.factors.LX = 'lambdaX';
    model.factors.LY = 'lambdaY';

In this example the transformations depend on parameters. To pass along these parameters, we encapsulate them inside anonymous functions. For example, the sigmoid transformation struct_sigmoid requires the interval rng in which to constrain its argument. Now we add the two factorizations of X and Y to the model with

    % Define the joint factorization of the matrices X and Y.
    model.factorizations.xfac.data = X;
    model.factorizations.xfac.cpd = {'U','V','LX'};
Another way to create a sparse tensor is to create a dense tensor in which most of the entries are zero. For example,

    T = zeros(5,5,5);
    T(1:31:end) = 1;
    T = fmt(T);

creates a 5 x 5 x 5 diagonal tensor T with diagonal elements equal to 1 and formats it as a sparse tensor. If not enough entries are zero, then fmt(T) returns T itself. To convert a sparse tensor back to a MATLAB array, use ful(T).

2.2 Tensor operations

2.2.1 For dense tensors

Matricization and tensorization. A dense tensor T can be flattened into a matrix with

    M = tens2mat(T,mode_row,mode_col);

Let size_tens = size(T); then the resulting matrix M is of size prod(size_tens(mode_row)) by prod(size_tens(mode_col)). For example,

    T = randn(3,5,7,9);
    M = tens2mat(T,[1 3],[4 2]);
    size(M)

outputs [21 45]. In tens2mat, a given column (row) of M is generated by fixing the indices corresponding to mode_col (mode_row) and then looping over the remaining indices in the order mode_row (mode_col). To transform a matricized tensor M back into its original size size_tens, use

    T = mat2tens(M,size_tens,mode_row,mode_col);

The most common use case is to matricize a tensor by placing its mode-n vectors as columns in a matrix, also called a mode-n matricization. This can be achieved by

    T = randn(3,5,7);
    n = 2;
    M = tens2mat(T,n);

where the optional argument mode_col is implicitly equal to [1:n-1 n+1:ndims(T)].

Tensor-matrix product. In a mode-n tensor-matrix product, the mode-n vectors of the tensor are premultiplied by a given matrix (cf. tmprod).
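A quick round-trip check of the two functions above; a minimal sketch in which the mode ordering must match between the two calls:

    % Matricize and tensorize back; the result should equal the original.
    T = randn(3,5,7,9);
    M = tens2mat(T,[1 3],[4 2]);
    T2 = mat2tens(M,[3 5 7 9],[1 3],[4 2]);
    relerr = frob(T-T2)/frob(T);   % zero up to rounding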
Algorithms. Tensorlab offers three algorithms for unconstrained complex optimization: minf_lbfgs, minf_lbfgsdl and minf_ncg. The first two are limited-memory BFGS with Moré-Thuente line search and dogleg trust region, respectively, and the last is nonlinear conjugate gradient with Moré-Thuente line search. Instead of the supplied Moré-Thuente line search, the user may optionally supply a custom line search method; see the help information for details.

With numerical differentiation. The complex optimization algorithms that solve (minf) require the scaled conjugate cogradient of the objective function f(z, z̄), which will be f_lyap(z, z̄) for the remainder of this section (cf. Section 7.1). The second argument of the unconstrained nonlinear minimization algorithms minf_lbfgs, minf_lbfgsdl and minf_ncg specifies how the gradient should be computed. To approximate the scaled conjugate cogradient with finite differences, set the second argument equal to the empty matrix []. An implementation for the Lyapunov equation could look like

    function z = lyap_minf_deriv(A,Q,z0)
        f = @(z)frob(A*z{1}*z{2}'+z{1}*z{2}'*A'+Q);
        z = minf_lbfgs(f,[],z0);
    end

As with the nonlinear least squares algorithms, the argument of the objective function is a cell array z consisting of the two matrices U and V. Likewise, the output of the optimization algorithm, in this case minf_lbfgs, is a cell array of the same format as the argument of the objective function f(z).

With the scaled conjugate cogradient.
Algorithms. Tensorlab offers three algorithms for unconstrained nonlinear least squares: nls_gndl, nls_gncgs and nls_lm. The first is Gauss-Newton with a dogleg trust region strategy, the second is Gauss-Newton with CG-Steihaug for solving the trust region subproblem, and the last is Levenberg-Marquardt. A bound-constrained method, nlsb_gndl, is also included and is a projected Gauss-Newton method with dogleg trust region. All algorithms are applicable to both analytic and nonanalytic residual functions and offer various ways of exploiting the structure available in their complex Jacobian.

With numerical differentiation. The complex optimization algorithms that solve (nls) require the Jacobian of the residual function F(z, z̄), which will be F_lyap(U,V) for the remainder of this section (cf. Section 7.1). The second argument dF of the nonlinear least squares optimization algorithms nls_gndl, nls_gncgs and nls_lm specifies how the Jacobian should be computed. To approximate the Jacobian with finite differences, set dF equal to 'Jacobian' or 'Jacobian-C'. The first case, 'Jacobian', corresponds to approximating the Jacobian $\frac{\partial \operatorname{vec}(F)}{\partial \operatorname{vec}(z)^T}$, assuming F is analytic in z. The second case, 'Jacobian-C', corresponds to approximating the complex Jacobian, consisting of the Jacobian $\frac{\partial \operatorname{vec}(F)}{\partial \operatorname{vec}(z)^T}$ and the conjugate Jacobian $\frac{\partial \operatorname{vec}(F)}{\partial \operatorname{vec}(\bar z)^T}$, where $z \in \mathbb{C}^n$. Since F_lyap does not depend on the complex conjugates of U or V, we may implement a nonlinear least squares solver for the low-rank solution of the Lyapunov equation as

    function z = lyap_nls_deriv(A,Q,z0)
        F = @(z)A*z{1}*z{2}'+z{1}*z{2}'*A'+Q;
        z = nls_gndl(F,'Jacobian',z0);
    end
It is clear that f(x) = f_R(x) for $x \in \mathbb{R}$, and hence f is a generalization of f_R to the complex plane. Because f is now a function of both z and z̄, the limit $\lim_{h \to 0} \frac{f(z+h)-f(z)}{h}$ no longer exists in general, and so it would seem a complex gradient does not exist either. In fact, this only tells us the function f is not analytic in z, i.e., its Taylor series in z alone does not exist. However, it can be shown that f is analytic in z and z̄ as a whole, meaning f has a Taylor series in the variables $z_C := [z^T\ \bar z^T]^T$ with a complex gradient

$$\frac{\partial f}{\partial z_C} := \begin{bmatrix} \frac{\partial f}{\partial z} \\ \frac{\partial f}{\partial \bar z} \end{bmatrix},$$

where $\frac{\partial f}{\partial z}$ and $\frac{\partial f}{\partial \bar z}$ are the cogradient and conjugate cogradient, respectively. The conjugate cogradient is a Wirtinger derivative and is to be interpreted as a partial derivative of f with respect to the variables z̄, while treating the variables z as constant. Computing both cogradients for the example above leads to two observations. First, $\overline{\partial f/\partial z} = \partial f/\partial \bar z$, which holds for any real-valued function f(z, z̄). A consequence is that any algorithm that optimizes f will only need one of the two cogradients, since the other is just its complex conjugate. Second, the cogradients evaluated in real variables $z \in \mathbb{R}^n$ are equal, up to a factor of 2, to the real gradient $\frac{\partial f_R}{\partial x}$. Taking these two observations into account, the unconstrained nonlinear optimization algorithms in Tensorlab require only the scaled conjugate cogradient $2\,\partial f/\partial \bar z$, which coincides with the real gradient when evaluated in real variables.
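The factor-of-2 relation can be checked numerically with Tensorlab's deriv. A minimal sketch, under the assumption that deriv's default output for a scalar real-valued function approximates the scaled conjugate cogradient (as deriv is used elsewhere in this guide), applied to f(z) = z'*z, whose real restriction f_R(x) = x'*x has gradient 2x:

    % For the real-valued function f(z) = z'*z, the scaled conjugate
    % cogradient evaluated at a real point should equal the real gradient 2*x.
    f = @(z)real(z'*z);
    x = randn(5,1);
    g = deriv(f,x);                   % numerical scaled conjugate cogradient
    relerr = norm(g-2*x)/norm(2*x);   % should be small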
Complex optimization comprises optimization of real functions in complex variables [12]. Tensorlab offers algorithms for complex optimization that solve unconstrained nonlinear optimization problems of the form

$$\min_{z \in \mathbb{C}^n} f(z,\bar z) \qquad \text{(minf)}$$

where $f: \mathbb{C}^n \to \mathbb{R}$ is a smooth function (cf. minf_lbfgs, minf_lbfgsdl, minf_ncg and minf_sr1cgs), and nonlinear least squares problems of the form

$$\min_{z \in \mathbb{C}^n} \tfrac{1}{2}\|F(z,\bar z)\|_F^2 \qquad \text{(nls)}$$

where $\|\cdot\|_F$ is the Frobenius norm and F is a smooth function that maps n complex variables to complex residuals (cf. nls_gndl, nls_gncgs and nls_lm). For nonlinear least squares problems, simple bound constraints of the form

$$\min_{z \in \mathbb{C}^n} \tfrac{1}{2}\|F(z,\bar z)\|_F^2 \quad \text{subject to} \quad \mathrm{Re}(l) \le \mathrm{Re}(z) \le \mathrm{Re}(u),\ \ \mathrm{Im}(l) \le \mathrm{Im}(z) \le \mathrm{Im}(u) \qquad \text{(nlsb)}$$

are also supported (cf. nlsb_gndl). Furthermore, when a real solution $z \in \mathbb{R}^n$ is sought, complex optimization reduces to real optimization and the algorithms are computationally equivalent to their real counterparts.

Prototypical example. Throughout this section we will use the Lyapunov equation

$$A X + X A^{\mathrm H} + Q = 0,$$

which has important applications in control theory and model order reduction, as a prototypical example. In this matrix equation, the matrices $A, Q \in \mathbb{C}^{n \times n}$ are given and the objective is to compute the matrix $X \in \mathbb{C}^{n \times n}$. Since the equation is linear in X, there exist direct methods to compute X. However, these are relatively expensive, requiring $O(n^3)$ floating point operations.
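To connect the example to the (nls) formulation, a small sketch of the residual and objective for a rank-k factorization X = U*V', the factored form used in the remainder of the section; frob is Tensorlab's Frobenius norm utility, and making Q Hermitian here is an assumption for illustration:

    % Residual F(U,V) = A*U*V' + U*V'*A' + Q and objective f = 0.5*frob(F)^2.
    n = 8; k = 2;
    A = randn(n);
    Q = randn(n); Q = Q+Q';   % Hermitian Q (illustrative assumption)
    U = randn(n,k); V = randn(n,k);
    F = A*U*V' + U*V'*A' + Q;
    f = 0.5*frob(F)^2;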
This options structure and the inline equivalent

    U = cpd_rnd(size_tens,R,struct('Real',@rand,'Imag',@rand));

both generate U{n} as rand(size_tens(n),R)+rand(size_tens(n),R)*1i.

Generating the associated full tensor. Given a cell array of factor matrices U = {U{1}, U{2}, ...}, its associated full tensor T can be computed with

    T = cpdgen(U);

This is equivalent to

    M = U{1}*kr(U(end:-1:2)).';
    size_tens = cellfun('size',U,1);
    T = mat2tens(M,size_tens,1);

3.2 Computing the CPD

The principal method. To compute the CPD of a dense, sparse or incomplete tensor T in R rank-one terms, call cpd(T,R). For example,

    % Generate pseudorandom factor matrices U and their associated full tensor T.
    size_tens = [7 8 9]; R = 4;
    U = cpd_rnd(size_tens,R);
    T = cpdgen(U);
    % Compute the CPD of the full tensor T.
    Uhat = cpd(T,R);

generates a real rank-4 tensor and decomposes it. Internally, cpd first compresses the tensor using a low multilinear rank approximation (see Section 4) if it is worthwhile, then chooses a method to generate an initialization U0 (e.g., cpd_gevd), after which it executes an algorithm to compute the CPD given the initialization (e.g., cpd_nls), and finally decompresses the tensor and refines the solution if compression was applied.

Setting the options. The different steps in cpd are customizable by supplying the method with an options structure (see help cpd for more information), e.g.,

    options.Display = true;
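A quick quality check on the decomposition just computed; a minimal sketch using only functions already shown in this guide:

    % Relative error of the reconstructed tensor; near machine precision
    % for this noiseless rank-4 example if cpd converged.
    relerr = frob(T-cpdgen(Uhat))/frob(T);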
struct_plus        Plus
struct_poly        Matrix with columns as polynomials
struct_power       Array power
struct_rational    Matrix with columns as rational functions
struct_rbf         Matrix with columns as sums of Gaussian RBF kernels
struct_sigmoid     Constrain array elements to an interval
struct_sqrt        Square root
struct_sum         Sum of elements
struct_times       Times
struct_toeplitz    Toeplitz matrix
struct_transpose   Transpose
struct_tridiag     Tridiagonal matrix
struct_tril        Lower triangular matrix
struct_triu        Upper triangular matrix
struct_vander      Vandermonde matrix

Utilities

Clustering
gap                Optimal clustering based on the gap statistic
kmeans             Cluster multivariate data using the k-means++ algorithm

Polynomials
polymin            Minimize a polynomial
polymin2           Minimize bivariate and real polyanalytic polynomials
polyval2           Evaluate bivariate and univariate polyanalytic polynomials
polysol2           Solve a system of two bivariate polynomials
ratmin             Minimize a rational function
ratmin2            Minimize bivariate and real polyanalytic rational functions

Statistics
cum3               Third-order cumulant tensor
cum4               Fourth-order cumulant tensor
scov               Shifted covariance matrices

Tensors
dotk               Dot product in K-fold precision
fmt                Format a data set
frob               Frobenius norm
ful                Convert a formatted data set into an array
kr                 Khatri-Rao product
kron               Kronecker product
mat2tens           Tensorize a matrix
mtkronprod         Matricized tensor Kronecker product
mtkrprod           Matricized tensor Khatri-Rao product
noisy              Generate a noisy version of a given array
sumk               Summation in K-fold precision
tens2mat           Matricize a tensor
tens2vec           Vectorize a tensor
tmprod             Mode-n tensor-matrix product
vec2tens           Tensorize a vector

Visualization
slice3             Visualize a third-order tensor with slices
spy3               Visualize the sparsity pattern of a third-order tensor
surf3              Visualize a third-order tensor with surfaces
voxel3             Visualize a third-order tensor with voxels
The residual function F(z) is matrix-valued and its argument is a cell array z consisting of the two matrices U and V. The output of the optimization algorithm, in this case nls_gndl, is a cell array of the same format as the argument of the residual function F(z).

With the Jacobian. Using the property $\operatorname{vec}(AXB) = (B^T \otimes A)\operatorname{vec}(X)$, it is easy to verify that the Jacobian of F_lyap(U,V) is given by

$$J_{\mathrm{lyap}} = \begin{bmatrix} \frac{\partial \operatorname{vec}(F_{\mathrm{lyap}})}{\partial \operatorname{vec}(U)^T} & \frac{\partial \operatorname{vec}(F_{\mathrm{lyap}})}{\partial \operatorname{vec}(V)^T} \end{bmatrix} = \begin{bmatrix} \bar V \otimes A + \overline{AV} \otimes I_n & I_n \otimes AU + \bar A \otimes U \end{bmatrix}.$$

To supply the Jacobian to the optimization algorithm, set the field dF.dz as the function handle of the function that computes the Jacobian at a given point z. For problems for which the residual function F depends on both z and z̄, the complex Jacobian can be supplied with the field dF.dzc. See Section 7.1 or the help page of the selected algorithm for more information. Applied to the Lyapunov equation, we have

    function z = lyap_nls_J(A,Q,z0)
        dF.dz = @J;
        z = nls_gndl(@F,dF,z0);

        function F = F(z)
            U = z{1}; V = z{2};
            F = A*U*V'+U*V'*A'+Q;
        end

        function J = J(z)
            U = z{1}; V = z{2};
            I = eye(size(A));
            J = [kron(conj(V),A)+kron(conj(A*V),I), ...
                 kron(I,A*U)+kron(conj(A),U)];
        end
    end

With the Jacobian's Gramian. When the residual function F is analytic in z, i.e., it is not a function of z̄, and the number of residuals is large compared to the number of variables n, it may be beneficial to compute the Jacobian's Gramian J'*J instead of the Jacobian itself.
Computing the Jacobian's Gramian J'*J instead of the Jacobian $J = \frac{\partial \operatorname{vec}(F)}{\partial \operatorname{vec}(z)^T}$ itself means each iteration of the nonlinear least squares algorithm no longer requires computing the pseudoinverse of J, but rather the less expensive pseudoinverse of J'*J. In the case of the Lyapunov equation, this can lead to a significantly more efficient method if the rank k of the solution is small with respect to its order n. Along with the Jacobian's Gramian, the objective function $f = \frac{1}{2}\|F\|_F^2$ and its gradient are also necessary. Skipping the derivation of the gradient and Jacobian's Gramian, the implementation could look like

    function z = lyap_nls_JHJ(A,Q,z0)
        AHA = A'*A;
        dF.JHJ = @JHJ;
        dF.JHF = @grad;
        z = nls_gndl(@f,dF,z0);

        function f = f(z)
            U = z{1}; V = z{2};
            F = A*U*V'+U*V'*A'+Q;
            f = 0.5*(F(:)'*F(:));
        end

        function g = grad(z)
            U = z{1}; V = z{2};
            gU = AHA*U*(V'*V)+A'*U*(V'*A'*V)+A'*Q*V+ ...
                 A*U*(V'*A*V)+U*(V'*AHA*V)+Q*A*V;
            gV = U'*AHA*U*V'+U'*A'*U*V'*A'+U'*A'*Q+ ...
                 U'*A*U*V'*A+U'*U*V'*AHA+U'*Q*A;
            g = {gU,gV'};
        end

Remark: in the more general case of a nonanalytic residual function, the structure in its complex Jacobian can be exploited by computing an inexact step. See the following paragraph for more details.

        function JHJ = JHJ(z)
            U = z{1}; V = z{2};
            I = eye(size(A));
            tmpa = kron(conj(V'*A*V),A');
Low multilinear rank approximation
lmlra              Low multilinear rank approximation
lmlra_hooi         LMLRA by higher-order orthogonal iteration
lmlra_minf         LMLRA by unconstrained nonlinear optimization
lmlra_nls          LMLRA by nonlinear least squares
lmlra3_dgn         LMLRA by a differential geometric Newton method
lmlra3_rtr         LMLRA by a Riemannian trust region method
mlsvd              Truncated multilinear singular value decomposition
lmlra_aca          LMLRA by adaptive cross approximation
lmlra_rnd          Pseudorandom initialization for LMLRA
lmlraerr           Errors between factor matrices in a LMLRA
lmlragen           Generate full tensor given a core tensor and factor matrices
lmlrares           Residual of a LMLRA
mlrank             Multilinear rank
mlrankest          Estimate multilinear rank

Structured data fusion
sdf_minf           Structured data fusion by unconstrained nonlinear optimization
sdf_nls            Structured data fusion by nonlinear least squares

Structure
struct_abs         Absolute value
struct_band        Band matrix
struct_cell2mat    Convert the contents of a cell array into a matrix
struct_conj        Complex conjugate
struct_ctranspose  Complex conjugate transpose
struct_diag        Diagonal matrix
struct_gram        Gramian matrix
struct_hankel      Hankel matrix
struct_inv         Matrix inverse
struct_invsqrtm    Matrix inverse square root
struct_invtransp   Matrix inverse transpose
struct_LL1         Structure of the third factor matrix in a rank-(Lr,Lr,1) BTD
struct_log         Natural logarithm
struct_matvec      Matrix-vector and matrix-matrix product
struct_nonneg      Nonnegative array
struct_normalize   Normalize columns to unit norm
struct_orth        Rectangular matrix with orthonormal columns
    load amino X;
    size_core = mlrankest(X);   % Optimal core tensor size at L-curve corner.
    [U,S] = lmlra(X,size_core);

Additionally, the compression and relative error of other choices of size_core can be viewed by using the figure's Data Cursor tool.

[Figure 4.2: The mlrankest tool for choosing the core tensor size size_core in a LMLRA. The L-curve shows the upper bound on the relative error and the MLSVD error versus the compression ratio, with the L-curve corner detected at size_core = [5 3 3]; a Data Cursor readout shows, for example, core tensor size [5 6 2] with relative error 0.01274.]

5 Block term decomposition

[Figure 5.1: A block term decomposition of a third-order tensor.]

A block term decomposition (BTD) [10, 17] approximates a tensor by a sum of low multilinear rank terms. Let $\mathcal{T}$ be a tensor of dimensions $I_1 \times I_2 \times \cdots \times I_N$ and let U{r} represent the r-th block term in the BTD. More specifically, let U{r}{N+1} be a core tensor of dimensions $J_1^{(r)} \times J_2^{(r)} \times \cdots \times J_N^{(r)}$ and let U{r}{n} be matrices of size $I_n \times J_n^{(r)}$ for $1 \le n \le N$; then

$$\mathcal{T} \approx \sum_{r=1}^{R} U\{r\}\{N{+}1\} \cdot_1 U\{r\}\{1\} \cdot_2 U\{r\}\{2\} \cdots \cdot_N U\{r\}\{N\}$$

is a BTD of the tensor $\mathcal{T}$ in the block terms U{r}. A visual representation of this decomposition in the third-order case is shown in Figure 5.1.

5.1 Problem and tensor generation
The inline equivalent of this example is

    ratmin2(randn(6),randn(4),struct('Plot',true));

Polyanalytic polynomials and rational functions. Polyanalytic polynomials p(z, z̄) are represented by a matrix p as

$$p(z,\bar z) = \begin{bmatrix} 1 & \bar z & \cdots & \bar z^{d_{\bar z}} \end{bmatrix} \begin{bmatrix} a_{00} & \cdots & a_{0,d_z} \\ \vdots & & \vdots \\ a_{d_{\bar z},0} & \cdots & a_{d_{\bar z},d_z} \end{bmatrix} \begin{bmatrix} 1 \\ z \\ \vdots \\ z^{d_z} \end{bmatrix}.$$

For example, the polynomial $p(z,\bar z) = 1 + 2z + 3z^2 + 4\bar z + 5\bar z z + 6\bar z z^2 + 7\bar z^2$ is represented by the matrix

    p = [1 2 3; 4 5 6; 7 0 0];

However, minimizing a polyanalytic polynomial p(z, z̄) only makes sense if p(z, z̄) is real-valued for all $z \in \mathbb{C}$. A polyanalytic polynomial is real-valued if and only if its matrix representation is Hermitian, i.e., p = p'. As with bivariate polynomials, the stationary points of a real-valued polyanalytic polynomial p(z, z̄) can be computed with polymin2(p). As an example, the stationary points of a pseudorandom real-valued polyanalytic polynomial can be computed with

    p = rand(6)+rand(6)*1i;
    p = p*p';               % p*p' is Hermitian.
    options.Plot = true;
    xy = polymin2(p,options);

Computing the stationary points of a polyanalytic rational function p(z,z̄)/q(z,z̄) is completely analogous to the polynomial case. For example,

    p = rand(6)+rand(6)*1i; p = p*p';
    q = rand(4)+rand(4)*1i; q = q*q';
    options.Plot = true;
    xy = ratmin2(p,q,options);

Remark. If the matrix p contains complex coefficients and is Hermitian, polymin2 will treat p as a real-valued polyanalytic polynomial. Otherwise, it will be treated as a bivariate polynomial.
The error between the factor matrices U{n} and their estimations Uhat{n} computed by lmlra is also a measure of the approximation error. The function lmlraerr computes the subspace angle between the given two sets of factor matrices. In other words,

    sangle = lmlraerr(U,Uhat);

returns a vector in which the n-th entry is subspace(U{n},Uhat{n}).

4.3 Choosing the size of the core tensor

For dense tensors, use the mlrankest tool to help determine the size of the core tensor size_core. Running mlrankest(T) plots an L-curve which represents the balance between an upper bound on the relative error of a LMLRA of T for different core tensor sizes, and the LMLRA's compression ratio. The following example applies mlrankest on the amino acids dataset [1]:

    url = 'http://www.models.life.ku.dk/go-engine?filename=amino.mat';
    urlwrite(url,'amino.mat');  % Download amino.mat in this directory.
    load amino X;
    mlrankest(X);

The resulting figure is shown in Figure 4.2. The corner of this L-curve is often a good estimate of the optimal trade-off between accuracy and compression. The core tensor size size_core corresponding to the L-curve corner is marked on the plot with a square and is also mlrankest's first output. By capturing it, as in size_core = mlrankest(X), the L-curve is no longer plotted. All together, a LMLRA of the amino acids dataset can be computed in only a few lines:

    url = 'http://www.models.life.ku.dk/go-engine?filename=amino.mat';
    urlwrite(url,'amino.mat');  % Download amino.mat in this directory.
The second operand represents the Kronecker product U{end} ⊗ ⋯ ⊗ U{n+1} ⊗ U{n-1} ⊗ ⋯ ⊗ U{1}. The function mtkronprod(T,U,n) computes the result of this operation without explicitly computing either of the operands and without permuting the elements of the tensor T in memory. Another useful way of interpreting this product is as the result of

    mode = [1:n-1 n+1:length(U)];
    Ut = cellfun(@transpose,U,'UniformOutput',false);
    tens2mat(tmprod(T,Ut(mode),mode),n)

Matricized tensor Khatri-Rao product. Similarly, algorithms for computing tensor decompositions usually don't explicitly need matricized tensors or Khatri-Rao products, but rather the matricized tensor Khatri-Rao product

    tens2mat(T,n)*kr(U([end:-1:n+1 n-1:-1:1]))

The function mtkrprod(T,U,n) computes the result of this operation without explicitly computing either of the operands and without permuting the elements of the tensor T in memory. Another useful way of interpreting the r-th column of this product is as the result of

    mode = [1:n-1 n+1:length(U)];
    Urt = cellfun(@(u)u(:,r)',U,'UniformOutput',false);
    tens2mat(tmprod(T,Urt(mode),mode),n)

Creating noisy versions of cell arrays. The noisy function can be used to create a noisy version of a MATLAB array, of a (nested) cell array of arrays, and of incomplete or sparse tensors. For example,

    U = {randn(10,2),randn(15,2),randn(20,2)};
    SNR = 20;
    Uhat = noisy(U,SNR);

creates a cell array U and a noisy version of that cell array, called Uhat, with a signal-to-noise ratio of 20 dB.
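A small consistency check of the Khatri-Rao identity above; a minimal sketch using real data (which sidesteps any conjugation conventions) and only functions from this guide, assuming mtkrprod follows the identity as quoted:

    % mtkrprod should match the explicit matricized-tensor Khatri-Rao product.
    U = {randn(4,3),randn(5,3),randn(6,3)};
    T = cpdgen(U);
    n = 2;
    M1 = mtkrprod(T,U,n);
    M2 = tens2mat(T,n)*kr(U([3 1]));   % U([end:-1:n+1 n-1:-1:1]) for n = 2
    relerr = frob(M1-M2)/frob(M2);     % zero up to rounding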
Functions that are analytic when their argument is real may no longer be analytic when their argument is complex. For example, F(z) = z̄·z is not analytic in $z \in \mathbb{C}$ because it depends on z̄, but it is analytic when $z \in \mathbb{R}$. An example of a function that is analytic for both real and complex z is the function F1 used below. The following two examples compute the relative error of the finite differences approximation of the Jacobian J1(x) in a real vector x:

    % Approximate the Jacobian of an analytic tensor-valued function with
    % finite differences and compute its relative error.
    x = randn(10,1);
    relerr1 = frob(J1(x)-deriv(F1,x,[],'Jacobian'))/frob(J1(x));

and the relative error of the finite differences approximation of J1(z) in a complex vector z:

    % Approximate the Jacobian of an analytic tensor-valued function with
    % finite differences and compute its relative error.
    z = randn(10,1)+randn(10,1)*1i;
    relerr1 = frob(J1(z)-deriv(F1,z,[],'Jacobian'))/frob(J1(z));

Nonanalytic vector-valued functions with finite differences. In general, a tensor-valued function may be a function of both its argument and its argument's complex conjugate. An example of such a nonanalytic function is the matrix-valued function F2(X,Y), where $a \in \mathbb{R}^n$ and $X, Y \in \mathbb{C}^{n \times n}$, built from log(trace(X*Y)) and the Hermitian parts of X and Y; it is nonanalytic because it depends on X and Y as well as on their complex conjugates. Its Jacobian and conjugate Jacobian are the derivatives of vec(F2) with respect to vec(z) and vec(z̄), respectively, where z = {X,Y}.
    [U,S] = lmlra_rnd(size_tens,size_core,options);

and its inline equivalent

    [U,S] = lmlra_rnd(size_tens,size_core,struct('Real',@rand,'Imag',@rand));

Generating the associated full tensor. Given a cell array of factor matrices U = {U{1}, U{2}, ...} and a core tensor S, its associated full tensor T can be computed with

    T = lmlragen(U,S);

This is equivalent to

    T = tmprod(S,U,1:length(U));

4.2 Computing a LMLRA

The principal method. To compute a LMLRA of a dense, sparse or incomplete tensor T with a core tensor of size size_core, call lmlra(T,size_core). For example,

    % Generate pseudorandom LMLRA (U,S) and associated full tensor T.
    size_tens = [17 19 21];
    size_core = [3 5 7];
    [U,S] = lmlra_rnd(size_tens,size_core);
    T = lmlragen(U,S);
    % Compute a LMLRA of a noisy version of T with 20dB SNR.
    [Uhat,Shat] = lmlra(noisy(T,20),size_core);

generates a real rank-(3,5,7) tensor and computes its low multilinear rank approximation. Internally, lmlra chooses a method to generate an initialization U0 and S0 (e.g., lmlra_aca), after which it executes an algorithm to compute the LMLRA given the initialization (e.g., lmlra_nls).

Setting the options. The two steps in lmlra are customizable by supplying the method with an options structure (see help lmlra for more information), e.g.,

    options.Display = true;              % Show progress on the command line.
    options.Initialization = @lmlra_rnd; % Select pseudorandom initialization.
These roots are computed as the eigenvalues of a generalized eigenvalue problem [11, 15]. Stationary points of bivariate polynomials and rational functions may be computed with polymin2 and ratmin2, respectively.

Polyanalytic univariate polynomials. Closely related is the problem of minimizing a polyanalytic univariate polynomial

$$\min_{z \in \mathbb{C}} p(z,\bar z) \qquad \text{(unipol)}$$

or, more generally, a rational function

$$\min_{z \in \mathbb{C}} \frac{p(z,\bar z)}{q(z,\bar z)} \qquad \text{(unirat)}$$

Analogously to the analytic bivariate case, all local minimizers are roots of the system

$$\frac{\partial p}{\partial z}\,q - p\,\frac{\partial q}{\partial z} = 0, \qquad \frac{\partial p}{\partial \bar z}\,q - p\,\frac{\partial q}{\partial \bar z} = 0,$$

where the derivatives are Wirtinger derivatives (cf. Section 7.1). The method polysol2 can also solve systems of polyanalytic polynomials

$$f(z,\bar z) = 0, \qquad g(z,\bar z) = 0. \qquad \text{(unisys)}$$

In fact, given a system of bivariate polynomials (bisys), polysol2 will first convert it to the form (unisys) before computing the roots as the eigenvalues of a complex generalized eigenvalue problem. Stationary points of real-valued polyanalytic polynomials and rational functions may be computed with polymin2 and ratmin2, respectively.

8.1 Stationary points of polynomials and rational functions

Polynomials and rational functions. In MATLAB, a polynomial p(x) is represented by a row vector p = [a_d ... a2 a1 a0] as

$$p(x) = \begin{bmatrix} a_d & \cdots & a_1 & a_0 \end{bmatrix} \begin{bmatrix} x^d \\ \vdots \\ x \\ 1 \end{bmatrix} = \sum_{i=0}^{d} a_i x^i.$$

For example, the polynomial $p(x) = x^3 + 2x^2 + 3x + 4$ is represented by the row vector p = [1 2 3 4]. Its derivative dp/dx can be computed with the MATLAB function polyder.
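A small sketch tying this representation to standard MATLAB polynomial tools; polyval and polyder are stock MATLAB, and polymin is the Tensorlab minimizer listed in the utilities overview:

    % p(x) = x^3 + 2x^2 + 3x + 4 and its derivative 3x^2 + 4x + 3.
    p = [1 2 3 4];
    dp = polyder(p);   % [3 4 3]
    y = polyval(p,2);  % 26
    x = polymin(p);    % minimizers of p (cf. polymin in the utilities list)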
