Faster solver for Sylvester-like systems

Preliminaries

The operators of a tensor can provide a wide lens into its structure. In particular, the operators can distinguish two tensors under the equivalence relation of change of bases. Throughout, we suppose $t$ is a $(d\times d\times d)$-grid of elements of some field $K$. For $1\leq i,j,k\leq d$, we write $t_{ij}^k\in K$ for the $(i,j,k)$-entry of $t$. Then $t$ can be interpreted as a $K$-multilinear map $V\times V\rightarrowtail V$, where $V=K^d$. Taking this interpretation a bit further, we can evaluate $t$ at $(u, v)\in V\times V$, and we write $\langle t \mid u, v\rangle \in V$ for this evaluation.

The ring of adjoint operators of $t$ is the $K$-algebra

\[\begin{aligned} \mathrm{Adj}(t) &= \left\{(X, Y) \in \mathbb{M}_d(K)\times\mathbb{M}_d(K) ~\middle|~ \forall u,v \in V,\; \langle t \mid uX^\dagger, v\rangle = \langle t \mid u, Yv\rangle \right\}, \end{aligned}\]

where $^\dagger$ denotes the matrix transpose. There is a naive, super-quadratic-time algorithm to compute the adjoint operators of $t$.

Lemma. 1 There exists a linear system of $2d^2$ variables and $d^3$ equations whose kernel determines a basis for $\mathrm{Adj}(t)$. This is solved using $O(d^7)$ field operations.

This computation is the bottleneck of many computations aimed at the isomorphism problem for tensors. In some nice cases, we can construct a basis for $\mathrm{Adj}(t)$ in nearly linear time. Before we discuss special cases, we need to discuss special cases of slicing tensors. These have general descriptions for general tensors, but we keep the discussion to what is necessary for bilinear maps.

For $1\leq i,j\leq d$, a $1$-slice of $t$ is the tensor $\langle t\mid\cdot, e_j \rangle : V \rightarrow V$, and a $2$-slice of $t$ is $\langle t\mid e_i, \cdot \rangle : V \rightarrow V$. A $0$-slice is slightly different, resulting in a bilinear form $V\times V\rightarrowtail K$. These $\ell$-slices correspond to matrices in the $(d\times d\times d)$-grid. For example, a $2$-slice for $i=1$ corresponds to the matrix whose $(j,k)$-entry is $t_{1j}^k$ for $1\leq j,k\leq d$.

For $1\leq k\leq d$, denote the $0$-slice of $t$ at $e_k$ by $t^k$, so $\langle t^k\mid : V\times V\rightarrowtail K$. The defining equations for $\mathrm{Adj}(t)$ can be divided into $d$ different equations based on the $0$-slices.

\[\begin{aligned} \langle t \mid uX^\dagger, v\rangle &= \langle t | u, Yv\rangle &\Longleftrightarrow & & (\forall k) \quad \langle t^k \mid uX^\dagger, v\rangle &= \langle t^k \mid u, Yv\rangle \end{aligned}\]

If $T^k$ is the corresponding matrix from the $0$-slice $t^k$, then each of these $d$ equations can be represented as a Sylvester-like system:

\[\begin{aligned} \langle t^k \mid uX^\dagger, v\rangle &= \langle t^k \mid u, Yv\rangle & \Longleftrightarrow & & XT^k - T^kY &= 0. \end{aligned}\]

The Sylvester equation determines $T^k$ given the matrices $X$ and $Y$.

Theorem 1. [Maglione–Wilson (2018)] There exists an algorithm that, given a $(d\times d\times d)$-grid with an invertible $1$- and $2$-slice, constructs a basis for $\mathrm{Adj}(t)$ using $O(d^4)$ field operations.

The key idea of Theorem 1 is to consider the structure of the linear system in Lemma 1 in terms of slices. Because of the invertibility conditions, we can clear most of the entries in the system. What remains is a homogeneous system whose solution is given by a formula in terms of certain slices.

Problems

We propose the following questions as a gradual approach to more general solutions. We think the approach taken in Theorem 1 might lead to an affirmative answer to the first question.

Problem 1. Does there exist a sub-quadratic-time or quadratic-time algorithm to construct a basis for $\mathrm{Adj}(t)$, given a $(d\times d\times d)$ tensor $t$?

We suspect that an affirmative answer to Open Problem 1 would apply immediately to general tensors and general nuclei, improving the computational complexity in all cases. In addition, computations of centroids should also benefit from such an affirmative answer as they are, in some sense, the intersection of nuclei. It is not clear if such a solution would apply to derivations. The derivation algebra of a $(d\times d\times d)$-grid is

\[\begin{aligned} \mathrm{Der}(t) &= \left\{ (X, Y, Z) \in \mathbb{M}_d(K)^{\oplus 3} ~\middle|~ \forall u, v\in V,\right. \\ & \qquad\qquad\qquad \left. \langle t | uX, v\rangle + \langle t \mid u, vY\rangle = \langle t \mid u, v\rangle Z \right\}. \end{aligned}\]

The proof of Lemma 1 applies to $\mathrm{Der}(t)$, and so computing $\mathrm{Der}(t)$ has the same complexity.

Problem 2 Does there exist a sub-quadratic-time or quadratic-time algorithm to construct a basis for $\mathrm{Der}(t)$, given a $(d\times d\times d)$ tensor $t$?

An affirmative answer to Open Problem 2 implies an affirmative answer to Open Problem 2. Assuming such an answer extends to general tensors, this implies that every operator set of a tensor $t$ whose annihilator is generated by linear polynomials can be constructed more efficiently than the naive approach.

Additional Questions?

If you have additional questions, feel free to reach out to a maintainer / contributor on the contact page.


Shrunk subspaces of alternating matrix tuples

Let $p\geq 3$ be a prime. Let $\Lambda(n,p)$ be the set of alternating matrices over...

Min-Rank

Takunari Miyazaki (Trinity College, Har...

Tensors
Algebras
Filters
Solvers
Identity
Resources