## Abstract

An introduction to the Hilbert spaces that are endowed with a reproducing kernel is presented using the mathematical tools of Fourier optics and coherence theory. After giving the basic definition of such spaces, some examples are worked out to show how the inner product can take different forms depending on the particular function space one works with. The basic rule to build a reproducing kernel Hilbert space (RKHS) is then presented together with the basic properties of those spaces. Eigenfunctions and eigenvalues of the reproducing kernel are then illustrated and lead to the important integral representation of the reproducing kernel. The latter is used to present pseudomodal expansions and generalized forms of sampling. The concluding section offers some thoughts on the applications of RKHSs in wave optics. An appendix presents an introduction to treatments using more advanced concepts of functional analysis.

© 2021 Optical Society of America

## 1. INTRODUCTION

In recent years, concepts and methods derived from the theory of reproducing kernel Hilbert spaces (RKHSs) have found applications in coherence theory as well as in coherent optical processing. The theory of RKHSs itself, though, has remained scarcely appreciated. A better knowledge of this theory would help to widen horizons and to suggest further applications. Before delving into specific textbooks on RKHSs, a scientist working in optics can be aided by an introduction where the basic concepts are presented, keeping the mathematics at a level comparable to that needed for Fourier optics and coherence theory. The tutorial to follow aims at this kind of presentation. Throughout the paper, examples are given (generally, in the last part of each section) to help readers get a better grasp of the subject. Introductory elements of functional analysis are collected in a dedicated appendix as an aid for readers interested in deepening their understanding of the subject with more advanced papers.

## 2. SOME FAMILIAR CASES

Before defining RKHSs, it is worthwhile to discuss some cases of these spaces that are well known by virtue of Fourier optics [1,2] and coherence theory [3]. Let us begin with the Hilbert space, say, ${\cal H}$, of Fourier transformable band-limited functions. Denoting by $f(x)$ a typical function of this set and by $\tilde f(\nu)$, or by ${\cal F}\{f(\cdot)\} (\nu)$, its Fourier transform, i.e.,

*reproduces* $f(x)$. Let us use the notation

$$K(x,y) = 2{\nu _M}\,{\rm sinc}[2{\nu _M}(x - y)],$$

and assume the inner (or scalar) product of two typical functions $f(y)$ and $g(y)$ in ${\cal H}$, to be denoted by ${\langle f(\cdot),g(\cdot)\rangle _{\cal H}}$, has the well-known expression

$${\langle f(\cdot),g(\cdot)\rangle _{\cal H}} = \int_{- \infty}^\infty f(y)\,{g^*}(y)\,{\rm d}y,$$

where the star denotes the complex conjugate. Then, Eq. (5) can be written as

$$f(x) = {\langle f(\cdot),K(\cdot ,x)\rangle _{\cal H}}. \tag{9}$$

Accordingly, we can say that in the present ${\cal H}$ space there exists a *reproducing kernel* $K(x,y)$ such that Eq. (9) holds true. The above property sounds evident in the Fourier domain which is again Eq. (3) written with a different symbol for rect. This is a first example of Hilbert space endowed with a *reproducing* kernel $K$. The characterizing property is expressed by Eq. (9).
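The reproduction property of the band-limited kernel can be checked numerically. The following is a minimal sketch, not part of the original treatment: it assumes the kernel $2{\nu _M}\,{\rm sinc}[2{\nu _M}(x - y)]$ with ${\rm sinc}(u) = \sin (\pi u)/(\pi u)$ (the convention of `np.sinc`), the ordinary ${L^2}$ inner product, and a hypothetical test function ${\rm sinc}^2({\nu _M}x)$ whose spectrum lies within $(-{\nu _M},{\nu _M})$. The band limit, the truncated integration range, and the sample points are illustrative choices.

```python
import numpy as np

nu_M = 1.0                                # assumed band limit (illustrative)
y = np.arange(-200.0, 200.0, 0.01)        # truncated integration grid
dy = y[1] - y[0]

def f(x):
    # Test function: sinc^2 has a triangular spectrum supported in [-nu_M, nu_M].
    return np.sinc(nu_M * x) ** 2

def K(x, y):
    # Band-limited reproducing kernel 2 nu_M sinc[2 nu_M (x - y)].
    return 2.0 * nu_M * np.sinc(2.0 * nu_M * (x - y))

def reproduce(x):
    # Riemann-sum approximation of the inner product <f(.), K(., x)>.
    return np.sum(f(y) * K(x, y)) * dy

xs = np.array([0.0, 0.3, 1.7, -2.4])
errors = np.array([abs(reproduce(x) - f(x)) for x in xs])
print(errors.max())
```

The residual error comes only from truncating the integration range; it shrinks as the grid is extended.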

Band-limited functions exhibit certain regularity features. For example, they cannot take on arbitrarily large values or vary at will passing from a point to another. In fact, the following upper bounds can be established [2] $\forall x$:

where $D$ is a real constant and Furthermore, for such functions the well-known sampling theorem holds [1,2].

As another example we can consider the set of trigonometric polynomials with a finite number of Fourier components. Assume the period to be $P$, and let the polynomials possess a number $2N + 1$ of Fourier components with indexes between ${-}N$ and $N$. In other words, consider the set of trigonometric polynomials expressed by the formula

where The scalar product of two typical functions $f$ and $g$ over one period gives where the last expression is obtained on writing $f$ and $g$ as sums of the form in Eq. (14) and exploiting the orthonormality of the exponential functions. With this scalar product, the set of functions specified by Eq. (14) constitutes a RKHS having a reproducing kernel given by the Dirichlet kernel [4] well known from the theory of Fourier series. The check that $K$ is a reproducing kernel is easily performed by taking into account the following expansion, and using again the orthonormality of the exponential functions.

Our last example deals with the well-known system of orthonormal polynomials ${P_n}(x)$ with $n = 0,1,\ldots$ defined on a set $X$ (e.g., Jacobi, Hermite, Laguerre) [5,6]. A typical polynomial $p(x)$ of order $N$ can be expressed as a linear combination of ${P_n}$ with ($n = 0,1,\ldots,N$), i.e.,

where the typical ${c_m}$ coefficient is evaluated as the inner product of $p(x)$ with the $m$th polynomial ${P_m}(x)$, defined as [5,6]

On inserting Eq. (19) into Eq. (20), and exploiting the orthonormality of the ${P_n}$, we have

Let us now define the kernel

and evaluate the inner product

It is worthwhile to note that the kernel ${K_N}(x,y)$ can be given a closed form through the Christoffel–Darboux formula [5,7,8]

Each of the previous examples specifies a Hilbert space endowed with its own reproducing kernel and its own inner product.
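The polynomial example lends itself to a short numerical check. The sketch below is an illustration under stated assumptions, not the authors' computation: it takes the Legendre polynomials on $[-1,1]$ (a Jacobi special case with unit weight), normalizes them as $\sqrt{(2n + 1)/2}\,{P_n}$, builds ${K_N}(x,y)$ as the sum of products of the orthonormal polynomials up to order $N$, and verifies that the inner product of a polynomial $p$ of order $N$ with ${K_N}(x,\cdot)$ returns $p(x)$. The order $N = 4$ and the test polynomial are hypothetical choices.

```python
import numpy as np
from numpy.polynomial import legendre as L

N = 4  # maximum polynomial order (illustrative)

def P_hat(n, x):
    # Orthonormal Legendre polynomial of order n on [-1, 1] with unit weight.
    c = np.zeros(n + 1)
    c[n] = 1.0
    return np.sqrt((2 * n + 1) / 2.0) * L.legval(x, c)

def K_N(x, y):
    # Kernel built as the sum of products of the first N + 1 orthonormal polynomials.
    return sum(P_hat(n, x) * P_hat(n, y) for n in range(N + 1))

# Gauss-Legendre rule with N + 1 nodes integrates polynomials up to degree 2N + 1 exactly.
nodes, weights = L.leggauss(N + 1)

def p(x):
    # Hypothetical test polynomial of order N.
    return 1.0 - 2.0 * x + 0.5 * x**3 - x**4

def reproduce(x):
    # Inner product of p with K_N(x, .) over [-1, 1].
    return np.sum(weights * p(nodes) * K_N(x, nodes))

for x0 in (-0.9, 0.1, 0.7):
    print(x0, reproduce(x0), p(x0))
```

Because the quadrature is exact for the polynomial degrees involved, the agreement is limited only by floating-point rounding.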

## 3. DEFINING REPRODUCING KERNEL HILBERT SPACES

We now give the definition of a RKHS, meaning a Hilbert space endowed with a reproducing kernel. Let ${\cal H}$ be a Hilbert space formed by functions, say, $f(x)$, defined on a set $X$. (From now on we shall follow the convention of mathematical papers on reproducing kernels where scalar symbols, like $x,y,\ldots$ may even refer to vectorial quantities. Furthermore, dimensional parameters will be omitted except for some examples.) An inner (or scalar) product between two typical elements $f$ and $g$ is then defined. It will be denoted by

$${\langle f,g\rangle _{\cal H}},$$

and it will induce a norm

$$||f|{|_{\cal H}} = \sqrt {{{\langle f,f\rangle }_{\cal H}}} .$$

It is worthwhile to recall that the properties of the inner product are the following (using a synthetic notation and assuming they hold $\forall f,g,h \in {\cal H}$, $\forall \alpha ,\beta \in {\mathbb C}$):

$$\langle f,g\rangle = {\langle g,f\rangle ^*},$$

$$\langle \alpha f + \beta g,h\rangle = \alpha \langle f,h\rangle + \beta \langle g,h\rangle ,$$

$$\langle f,f\rangle \ge 0,\quad \langle f,f\rangle = 0 \Leftrightarrow f = 0.$$

The space is said to be a RKHS [4,7,9–12] if there exists a kernel $K(x,y)$, defined over $X \times X$, such that

$$A)\quad K(\cdot ,x) \in {\cal H},\quad \forall x \in X, \tag{32}$$

$$B)\quad f(x) = {\langle f(\cdot),K(\cdot ,x)\rangle _{\cal H}},\quad \forall f \in {\cal H},\;\forall x \in X. \tag{33}$$

The kernel $K$ is called the *reproducing kernel*. Property $B)$ accounts for the name given to ${\cal H}$. We see that, in the spaces under consideration, $K$ plays a role similar to that of a Dirac delta function. One may wonder whether the Dirac delta itself can be used as a reproducing kernel. The answer is negative because of condition $A)$. In fact, $K(\cdot ,x)$ must be a function belonging to ${\cal H}$, whereas $\delta$ is a distribution [4,7]. (Extensions to theories where the $\delta$ can be included exist [7].)

By virtue of condition $A)$, $f(\cdot)$ can be replaced by $K(\cdot ,x)$ so that, using condition $B)$, we have

In particular, for $x = y$ we obtain Note that if $c$ is a complex constant, we have and Furthermore, we have [see Eq. (27)]

Let us now show that the reproducing kernel is non-negative definite, i.e., that, once an arbitrary number $N$ of points ${x_1},{x_2},\ldots,{x_N}$ and of complex constants ${c_1},{c_2},\ldots,{c_N}$ are fixed, the quadratic form $Q$, defined as follows,

satisfies the condition or equivalently that the matrix $\{K({x_i},{x_j})\} _{i,j = 1}^N$, known as the Gram matrix [4], is non-negative. In fact, on exploiting Eq. (34), we can write

*principal diagonal*.

A fundamental theorem is that of Moore–Aronszajn [7,9,10]. The theorem states that to any RKHS a non-negative definite reproducing kernel is associated (as we have just seen), and, conversely, that to any non-negative definite kernel $K$ we can associate a RKHS for which $K$ is the reproducing kernel.

Something has to be added about terminology. Many authors use the term *positive definite* for a kernel satisfying Eq. (41) and specify *strictly positive definite* if Eq. (41) holds with the strict inequality only. Others speak of a *positive semi-definite* kernel, limiting the use of *positive definite* to the case in which Eq. (41) holds with the strict inequality only. In this paper, we shall use the term *non-negative* when Eq. (41) holds and the term *positive* if Eq. (41) holds with the strict inequality only.
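Non-negative definiteness can be probed numerically through the eigenvalues of the Gram matrix. The sketch below is only an illustration: the Gaussian kernel, the box-shaped function, and the three sample points are hypothetical choices that do not appear in the text. The Gaussian gives a Gram matrix with positive eigenvalues, whereas the box-shaped function $K(x,y) = 1$ for $|x - y| \lt 1$ (and 0 otherwise) gives a negative eigenvalue and therefore cannot be a reproducing kernel.

```python
import numpy as np

def gram(kernel, points):
    # Gram matrix {K(x_i, x_j)} for a list of points.
    x = np.asarray(points, dtype=float)
    return kernel(x[:, None], x[None, :])

def gaussian(x, y):
    # Gaussian kernel (non-negative definite).
    return np.exp(-(x - y) ** 2)

def box(x, y):
    # Box-shaped function of x - y (not non-negative definite).
    return (np.abs(x - y) < 1.0).astype(float)

pts = [0.0, 0.75, 1.5]
eig_gauss = np.linalg.eigvalsh(gram(gaussian, pts))  # ascending order
eig_box = np.linalg.eigvalsh(gram(box, pts))

print(eig_gauss)  # all positive
print(eig_box)    # smallest eigenvalue equals 1 - sqrt(2) < 0
```

For the box function the Gram matrix is tridiagonal with unit entries, whose eigenvalues are $1$ and $1 \pm \sqrt 2$, so a single well-chosen triple of points is enough to rule it out.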

## 4. EXAMPLES

Before considering general properties of RKHSs, it is worthwhile to see a few examples.

#### A. Shift-Invariant Kernels

Let us consider a typical shift-invariant kernel $K(x - y)$. Particular cases were already considered in Section 2. We shall now refer to a more general case. We recall Bochner’s theorem [13], according to which a function $K(x - y)$ is non-negative definite if and only if the Fourier transform of $K(\cdot)$ is real and non-negative. For simplicity, we assume $\tilde K(\nu)$ to be positive within a given interval (possibly of infinite extent). In general, Eq. (10) does not hold, and $K$ is not a reproducing kernel. However, by virtue of the Moore–Aronszajn theorem, a reproducing kernel exists. We can find it on exploiting the definition of the inner product. To this end we define the inner product of two Fourier transformable functions $f(x)$ and $g(x)$ as follows:

For a simple example, let us consider the kernel

whose FT is the non-negative function [1,2] According to Eq. (45), the inner product of two functions $f$ and $g$ is then

#### B. Real Functions in [0,1]

Let us consider the set of *real* functions that are continuous in the interval [0,1], vanishing at the origin and endowed with continuous first derivative in [0,1] (possibly except for a finite number of points). We assume that on such a set the following kernel is acting,

$$K(x,y) = \min (x,y).$$

Since the derivative of $K$ with respect to $x$ equals 1 for $0 \le x \le y$ and 0 elsewhere, we easily conclude that $U$ is nothing else than the first derivative. Then the inner product between two functions $f$ and $g$ has to be defined as

$${\langle f,g\rangle _{\cal H}} = \int_0^1 f'(x)\,g'(x)\,{\rm d}x. \tag{56}$$

Again, it could be proved that Eq. (56) defines a bona fide inner product.

For a further example, we take the space of real continuous functions defined in the interval [0,1], there possessing a continuous first derivative except at a finite number of points and vanishing at the end points. Using again Eq. (56) as a definition of inner product, it is easy to check that the space ${\cal H}$ thus obtained is a RKHS, whose reproducing kernel is given by

$$K(x,y) = \min (x,y) - xy.$$

This kernel equals $x(1 - y)$ for $x \le y$ and $y(1 - x)$ for $x \ge y$. Then, for any $y$, the graph versus $x$ has a triangular shape. For this reason, the kernel is called *triangular* [14].
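Both kernels of this subsection can be tested against the derivative-based inner product. The sketch below is an illustration under stated assumptions: the first kernel is taken as $\min (x,y)$ (the form whose $x$ derivative equals 1 for $x \le y$ and 0 elsewhere), the triangular kernel as $\min (x,y) - xy$ (which equals $x(1 - y)$ for $x \le y$), and the inner product as the integral over [0,1] of the product of first derivatives. The test function $\sin (\pi x)$ is a hypothetical choice that vanishes at both end points, so that it belongs to both spaces.

```python
import numpy as np

n = 20000
x = (np.arange(n) + 0.5) / n   # midpoints of a uniform grid on [0, 1]
dx = 1.0 / n

def f(t):
    return np.sin(np.pi * t)            # vanishes at t = 0 and t = 1

def df(t):
    return np.pi * np.cos(np.pi * t)    # first derivative of f

def dK_min(x, y):
    # d/dx of min(x, y): 1 for x < y, 0 for x > y.
    return (x < y).astype(float)

def dK_tri(x, y):
    # d/dx of min(x, y) - x*y.
    return (x < y).astype(float) - y

def inner_with_kernel(dK, y):
    # Midpoint-rule approximation of the integral of f'(x) * dK/dx over [0, 1].
    return np.sum(df(x) * dK(x, y)) * dx

for y0 in (0.25, 0.5, 0.8):
    print(y0, inner_with_kernel(dK_min, y0), inner_with_kernel(dK_tri, y0), f(y0))
```

In both cases the inner product of $f$ with the kernel centered at $y$ returns $f(y)$, up to the quadrature error of the midpoint rule.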

#### C. Szegö’s Kernel

Both this example and the one to follow show the use of reproducing kernels for functions of complex variables. Let $f(z)$ and $g(z)$ be two functions of the complex variable $z = r{e^{{i\theta}}}$ representable as a convergent power series for $r \le 1$:

Since ${z_1},{z_2}$ are on the unit disk of the complex plane, the sum defining $K$ converges to

which is Szegö’s kernel [4].

#### D. Bergman’s Kernel

Let $f(z)$ and $g(z)$ be two functions of the complex variable $z = r{e^{{i\vartheta}}}$, expressible as the convergent series for $r \le 1$:

## 5. CONSTRUCTION OF RKHSS

Apart from the ad hoc procedures exemplified in Section 4, a general problem to be faced is: Given a non-negative definite kernel $K(x,y)$, how do we construct the associated RKHS?

Let us present the standard approach of Aronszajn based on linear combinations of kernel functions. Take $N$ points ${x_1},{x_2}, \ldots ,{x_N}$ and $N$ complex constants ${a_1},{a_2}, \ldots ,{a_N}$, where $N$ is an integer. Consider now functions of the form

The inner product of two functions $f$ and $g$ (with coefficients ${b_n}$) is, by virtue of the linearity of the inner product,

The space built in this way is not complete [9] (it is called a pre-Hilbertian space). It is dense, though [9], in the Hilbert space obtained by completion on adding the limits of the Cauchy sequences converging in norm and taking into account that, as will be seen in Section 6, the latter implies pointwise convergence.

Let us pause for a moment to put into evidence that the construction of a RKHS by means of a combination of kernels centered at distinct points, as in Eq. (73), gives an idea of the regularity properties of the functions belonging to the RKHS. Apparently, they are the same as those of the kernel itself. For a limiting example, let us refer to the kernel $K(x,y) = xy$ on the domain $[ 0,1 ] \times [ 0,1 ]$ whose non-negativeness is easily proved. It is clear that, no matter how the quantities ${a_n}$ and ${x_n}$ are chosen in Eq. (73), the resulting function will be of the form $ax$, where $a$ can possibly vanish. Hence, the RKHS associated to the present kernel includes one type of function only.
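The construction by linear combinations of kernel functions can be sketched in a few lines for the limiting example $K(x,y) = xy$. The code below is an illustrative sketch: it assumes real coefficients (so that no choice about complex conjugation is needed), and the coefficient values and points are hypothetical. It shows that any such combination reduces to $ax$, that its squared norm computed through the Gram matrix is ${a^2}$, and that the inner product with $K(\cdot ,y)$ returns $f(y)$.

```python
import numpy as np

def kernel(x, y):
    # K(x, y) = x * y on [0, 1] x [0, 1].
    return x * y

def make_f(a, xn):
    # f(x) = sum_n a_n K(x, x_n).
    return lambda x: sum(an * kernel(x, xk) for an, xk in zip(a, xn))

def inner(a, xn, b, ym):
    # <f, g> for f = sum a_n K(., x_n), g = sum b_m K(., y_m), real coefficients.
    G = kernel(np.asarray(xn)[:, None], np.asarray(ym)[None, :])
    return float(np.asarray(a) @ G @ np.asarray(b))

a, xn = [0.5, -1.0, 2.0], [0.2, 0.5, 0.9]
f = make_f(a, xn)

slope = sum(an * xk for an, xk in zip(a, xn))   # f(x) = slope * x
norm_sq = inner(a, xn, a, xn)                   # equals slope**2
reproduced = inner(a, xn, [1.0], [0.3])         # <f, K(., 0.3)>

print(slope, norm_sq, reproduced, f(0.3))
```

With these values the slope is 1.4, the squared norm is 1.96, and the reproduced value at $y = 0.3$ is 0.42.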

## 6. GENERAL PROPERTIES OF RKHSS

The examples of Section 4 could give the idea that we would have found different reproducing kernels on changing the search procedure. This is not the case. In fact, we now show that the reproducing kernel is unique. Arguing by contradiction, imagine that, in a typical RKHS endowed with a reproducing kernel $K$, another reproducing kernel, say, ${K_1}(x,y)$, exists. Then for some $y$ it should be

On the other hand, the left-hand side can be written

Let us now recall that in any Hilbert space the Cauchy–Schwarz inequality holds:

$$|{\langle f,g\rangle _{\cal H}}| \le ||f||\;||g||. \tag{79}$$

On applying this inequality to the reproduction relation from Eq. (33) and using Eq. (35), we obtain

$$|f(y)| \le ||f||\sqrt {K(y,y)} , \tag{80}$$

which shows that the values of $|f(y)|$ are limited by the norm of $f$ and by the values of $K$ along the principal diagonal.

Let us note that Eq. (80) entails $f(y) \equiv 0$ if $||f|| = 0$. We further note that on replacing $f$ with $K$ in Eq. (80) we have

$$|K(x,y)| \le \sqrt {K(x,x)\,K(y,y)} .$$

Let us add a property about convergence in RKHSs. Let ${f_n}(\cdot)$ be a sequence that converges in norm. On applying the reproduction relation and inequality from Eq. (79), we find for a typical pair $n, m$

Let us finally show that the variations of a typical function belonging to a RKHS are limited by the kernel’s features. To this end, we consider the modulus of the difference of the values of $f$ at two points, say, $x$ and $y$. Taking the reproduction property into account we can write

The inequalities from Eqs. (80) and (89) generalize to any RKHS the limitations from Eqs. (11) and (12) that held specifically for band-limited functions.
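The two limitations can be checked numerically on a function built as a finite combination of kernel functions. The sketch below is an illustration under stated assumptions: it uses a Gaussian kernel, random real coefficients with a fixed seed, the bound $|f(y)| \le ||f||\sqrt {K(y,y)}$, and, for the variations, the Cauchy–Schwarz consequence $|f(x) - f(y)| \le ||f||\sqrt {K(x,x) + K(y,y) - 2{\rm Re}\,K(x,y)}$, which is assumed here to be the kind of bound referred to in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def K(x, y):
    # Gaussian kernel (illustrative choice).
    return np.exp(-(x - y) ** 2)

centers = rng.uniform(-2.0, 2.0, size=6)   # hypothetical points x_n
a = rng.normal(size=6)                     # hypothetical real coefficients a_n

G = K(centers[:, None], centers[None, :])
norm_f = np.sqrt(a @ G @ a)                # RKHS norm of f = sum a_n K(., x_n)

def f(x):
    return K(np.asarray(x)[..., None], centers) @ a

grid = np.linspace(-3.0, 3.0, 61)
vals = f(grid)
diag = K(grid, grid)                       # K(x, x) on the grid

# Pointwise bound |f(y)| <= ||f|| sqrt(K(y, y)).
pointwise_ok = np.all(np.abs(vals) <= norm_f * np.sqrt(diag) + 1e-9)

# Variation bound |f(x) - f(y)| <= ||f|| * ||K(., x) - K(., y)||.
diff = np.abs(vals[:, None] - vals[None, :])
dist_sq = diag[:, None] + diag[None, :] - 2.0 * K(grid[:, None], grid[None, :])
variation_ok = np.all(diff <= norm_f * np.sqrt(np.maximum(dist_sq, 0.0)) + 1e-9)

print(pointwise_ok, variation_ok)
```

Both flags come out true for any choice of coefficients, since the bounds follow from the Cauchy–Schwarz inequality alone.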

It is natural to wonder whether different reproducing kernels (RKs) can be somehow combined to give new kernels. The simplest of such operations is the sum. The following theorem holds true [4]: “Let ${K_1}$ and ${K_2}$ be RKs for spaces ${{\cal H}_1}$ and ${{\cal H}_2}$, respectively, of functions on $X$ with corresponding norms ${||.||_{{{\cal H}_1}}}$ and ${||.||_{{{\cal H}_2}}}$. Then $K = {K_1} + {K_2}$ is the RK of the space of functions $f = {f_1} + {f_2}$ where ${f_1} \in {{\cal H}_1}$ and ${f_2} \in {{\cal H}_2}$ with the norm $||f||_{\cal H}^2 = {\min}\; (||{f_1}||_{{{\cal H}_1}}^2 + ||{f_2}||_{{{\cal H}_2}}^2)$.”

The difference of two RKs is not necessarily a genuine RK, so a specific analysis is to be done case by case [15,16].

Lastly, a RK remains valid if multiplied by a constant and even if multiplied by a different (valid) RK.

The above hints can give an idea of how RKs can be extended by simple operations.
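These combination rules can be illustrated on Gram matrices. The sketch below is not a proof and uses hypothetical choices throughout: a Gaussian kernel, the kernel $\min (x,y)$ on positive points, and a narrower Gaussian for the difference. The sum and the (pointwise) product give non-negative Gram matrices, while the difference of the two Gaussians has a zero diagonal and non-zero off-diagonal entries, so its Gram matrix has zero trace and must possess a negative eigenvalue.

```python
import numpy as np

pts = np.array([0.5, 1.0, 1.5])
X, Y = pts[:, None], pts[None, :]

G1 = np.exp(-(X - Y) ** 2)            # Gaussian kernel
G2 = np.minimum(X, Y)                 # min(x, y) kernel on positive points
G_narrow = np.exp(-4.0 * (X - Y) ** 2)  # narrower Gaussian kernel

eig_sum = np.linalg.eigvalsh(G1 + G2)         # sum of two RKs
eig_prod = np.linalg.eigvalsh(G1 * G2)        # pointwise product of two RKs
eig_diff = np.linalg.eigvalsh(G1 - G_narrow)  # difference of two RKs

print(eig_sum[0], eig_prod[0], eig_diff[0])
```

The last number is negative, which is why a difference of kernels has to be examined case by case.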

## 7. EIGENFUNCTIONS AND EIGENVALUES

Here we shall give the basis of an approach to RKHSs that differs from that used in Section 5.

The pertinent treatment is based on the homogeneous Fredholm integral equation of the second kind [13,14]:

$$\int_D K(x,y)\,\Phi (y)\,{\rm d}y = \lambda \Phi (x), \tag{91}$$

where $D$ specifies the integration domain while $\Phi$ and $\lambda$ denote an eigenfunction and the corresponding eigenvalue. As we shall see, such quantities play an important role in RKHSs.

Under suitable conditions, in particular if the following inequality,

is met, a discrete set of eigenfunctions ${\Phi _n}$ and eigenvalues ${\lambda _n}$ is found [13]. The eigenfunctions are (or can be made if degenerate eigenvalues exist) mutually orthogonal and normalized. Since the reproducing kernel $K$ is non-negative definite the eigenvalues ${\lambda _n}$ are non-negative. As a rule the eigenvalues are ordered in a non-increasing way.

A fundamental result is given by Mercer’s theorem [13], expressing the kernel $K$ as the following series in $x$ and $y$ (for the so-called Pincherle–Goursat kernels, the series is replaced by a sum [14]):

$$K(x,y) = \sum\limits_n {\lambda _n}{\Phi _n}(x)\Phi _n^*(y), \tag{93}$$

where the convergence of the series is uniform with respect to both $x$ and $y$. Since the eigenfunctions are often interpreted as *modes* associated to $K$, the equality from Eq. (93) is also known as the *modal expansion* of the kernel.
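Eigenvalues and the modal expansion can be explored numerically by discretizing the integral equation on a grid (a Nyström-type approach, which is a swapped-in numerical technique and not part of the paper's treatment). The sketch below assumes the kernel $\min (x,y)$ on $(0,1)$, for which the closed-form eigenpairs ${\lambda _n} = 1/{[(n - 1/2)\pi ]^2}$ and ${\Phi _n}(x) = \sqrt 2 \sin [(n - 1/2)\pi x]$, $n = 1,2,\ldots$, are quoted as a known result. The grid size and the number of retained terms are illustrative.

```python
import numpy as np

n = 400
h = 1.0 / n
x = (np.arange(n) + 0.5) * h                 # midpoint grid on (0, 1)

# Discretized integral operator: (K * h) acting on sampled eigenfunctions.
K = np.minimum(x[:, None], x[None, :])
eigvals = np.linalg.eigvalsh(K * h)[::-1]    # non-increasing order

k = np.arange(1, 4)
exact = 1.0 / ((k - 0.5) * np.pi) ** 2       # closed-form eigenvalues
rel_err = np.abs(eigvals[:3] - exact) / exact

def mercer_sum(x0, y0, terms=2000):
    # Truncated modal expansion sum_n lambda_n Phi_n(x0) Phi_n(y0).
    m = (np.arange(1, terms + 1) - 0.5) * np.pi
    return np.sum(2.0 * np.sin(m * x0) * np.sin(m * y0) / m ** 2)

print(rel_err)
print(mercer_sum(0.3, 0.7))   # close to min(0.3, 0.7) = 0.3
```

The leading numerical eigenvalues agree with the closed-form ones, and the truncated series approaches $\min (x,y)$ as more terms are added.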

Let us now discuss the role of eigenfunctions and eigenvalues in the inner product operation. Given a typical eigenfunction ${\Phi _j}$, the reproduction property holds (as for any function of the space),

or, according to Eq. (93),

Let us now take two functions $f(x)$ and $g(x)$ and expand them in eigenfunction series:

Their inner product in ${\cal H}$ is

It is easy to check, by using this expression of the inner product, that $K$ is the reproducing kernel. Let us, in fact, evaluate the inner product between $f(x)$ written according to Eq. (100) and $K(x,y)$ written according to Eq. (93). In the latter, the coefficient role is played by the terms ${\lambda _n}\Phi _n^*(y)$. Then, using Eq. (103), we find

Let us add a qualitative remark. According to Eq. (103), the squared norm of $f$ is given by

*smaller* than ${L^2}$.

It will be noted that the first example of RKHS we saw, i.e., the one dealing with band-limited functions in (${-}{\nu _M},{\nu _M}$), whose reproducing kernel was $2{\nu _M}$ sinc[$2{\nu _M}(x - y)$], was not discussed in terms of eigenfunctions. This is because for such a kernel Eq. (92) does not hold. This is also true for all the shift-invariant kernels. Limiting ourselves to an intuitive hint, for all such kernels the eigenfunctions form a continuous set. This can be easily seen on writing the pertaining integral equation

and passing to the Fourier domain. Then the convolution theorem gives For a typical kernel $K$, Eq. (108) has Dirac delta functions as solutions. More explicitly, letting with an arbitrary ${\nu _0}$, Eq. (108) becomes

Just to give an example, let us see how eigenfunctions and eigenvalues are evaluated for the kernel (Section 4)

acting in the interval $(0,1)$ with the condition that $\Phi (x)$ vanishes at $x = 0$. To see how the integral equation is solved, note that the explicit form of Eq. (91) reads Differentiating with respect to $x$, we obtain and with a further differentiation Hence, the solutions of the harmonic oscillator equation come into play. Since the sought solutions have to vanish for $x = 0$, they are of the form where the normalizing factor $A$ and $\alpha$ are to be determined. On inserting Eq. (115) into Eq. (112), we easily find as well as $A = \sqrt 2$. All the eigenvalues are positive, and this implies that the kernel is positive definite [13,14]. Using now the Mercer theorem [Eq. (93)], we obtain

## 8. INTEGRAL REPRESENTATION

In this section we will be examining an integral representation of reproducing kernels. It was developed by Parzen [16] and imported into optics much later [17,18]. Let us consider kernels of the form

where $w$ is non-negative over the integration domain $T$. The function $A$ is assumed to be square summable over $T$. It is easily seen that $K$ is non-negative definite. To this end, let us insert Eq. (118) into Eq. (40). This leads to as predicted. Actually, in Eq. (119) the strict inequality sign holds (thus giving the kernel a strictly positive definite nature) unless the functions $A(t,{x_i})$ are linearly dependent [16].

It is found [16,19] that the space ${\cal H}$ associated to $K(x,y)$ is that of the functions $f(x)$ obtained through the linear transformation ${\cal L}$,

where $F$ belongs to the closure of the linear span of $A$ and is uniquely associated to $f$. (A linear combination of an arbitrary number of functions $A(t,{x_i})$, where ${x_i}$ are distinct, fixed values of $x$.) Notice that, for given $y$, $K$ itself, see Eq. (118), has this structure [replacing $F(t)$ with $A(t,y)$], i.e., We assume ${\cal L}$ to be invertible and denote by ${{\cal L}^{- 1}}$ the inverse operator, so that we have We now define the inner product in ${\cal H}$ as follows:

It will be noted that Eq. (120) represents an integral transform of $F(t)$, including, for a suitable choice of $A$ and of the domain $T$, the Fourier, Fresnel, and Hankel cases, to name a few.
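The integral representation can be evaluated numerically for a Fourier-type choice of $A$. The sketch below is an illustration under stated assumptions: the kernel is written as $K(x,y) = \int_T w(t)\,{A^*}(t,x)\,A(t,y)\,{\rm d}t$ (the placement of the complex conjugate is an assumption), with $T = [- {\nu _M},{\nu _M}]$, $A(t,x) = {e^{2\pi itx}}$, and $w(t) = 1$. The integral is computed by Gauss–Legendre quadrature and compared with $2{\nu _M}\,{\rm sinc}[2{\nu _M}(x - y)]$; the Gram matrix is then checked for non-negativity. The band limit and the sample points are hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

nu_M = 1.5                          # assumed band limit (illustrative)
nodes, weights = leggauss(64)       # Gauss-Legendre rule on [-1, 1]
t = nu_M * nodes                    # quadrature nodes mapped to T = [-nu_M, nu_M]
wq = nu_M * weights                 # rescaled quadrature weights

def A(t, x):
    return np.exp(2j * np.pi * t * x)

def w(t):
    return np.ones_like(t)

def K(x, y):
    # Assumed form: K(x, y) = integral over T of w(t) A*(t, x) A(t, y) dt.
    x = np.asarray(x, dtype=float)[..., None]
    y = np.asarray(y, dtype=float)[..., None]
    return np.sum(wq * w(t) * np.conj(A(t, x)) * A(t, y), axis=-1)

pts = np.linspace(-1.0, 1.5, 6)
G = K(pts[:, None], pts[None, :])   # Gram matrix from the integral representation
expected = 2.0 * nu_M * np.sinc(2.0 * nu_M * (pts[:, None] - pts[None, :]))

max_dev = np.max(np.abs(G - expected))
min_eig = np.linalg.eigvalsh(G).min()
print(max_dev, min_eig)
```

Replacing `A` and `w` with other choices gives, in the same way, kernels associated with other integral transforms.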

We can give a very simple example. Let $T = [- {\nu _M},{\nu _M}]$, $A(t,x) = {e^{2\pi itx}}$, and $w(t) = 1$. Then, Eq. (118) gives

$$K(x,y) = 2{\nu _M}\,{\rm sinc}[2{\nu _M}(x - y)].$$

A more elaborate example is obtained on taking

As a final example consider the following kernel,

with constant ${K_0},\;\alpha \gt 0,\;\beta$. The kernel from Eq. (133) can be written

## 9. PSEUDOMODAL EXPANSIONS

Let us consider a kernel of the form from Eq. (118) and the associated Hilbert space ${\cal H}$. Let $\{{F_n}(t)\} _{n = 0}^\infty$ be a basis in ${L^2}(T)$. Then the functions obtained through Eq. (120), say, $\{{f_n}(x)\} _{n = 0}^\infty$, form a basis in ${\cal H}$. To prove this let us first note that the functions $\{{f_n}(x)\}$ are orthonormal. In fact [see Eq. (123)],

Next note that if $f \in {\cal H}$ is such that ${\langle {f_n},f\rangle _{\cal H}} = 0,\;\;\forall n$ then $f = 0$. In fact, Since ${F_n}$ is a basis in ${L^2}(T)$, Eq. (138) implies $F = 0$, which in turn gives $f = 0$. Then the functions ${f_n}$ form a basis in ${\cal H}$.

For any fixed $y = {y_0}$, $K(x,{y_0}) \in {\cal H}$ [see Eq. (32)]. Then

where [Eq. (120)]

If we now fix $x = {x_0}$, we can proceed in a similar way and reach the expression

Following a procedure analogous to the previous one, we obtain that the convergence in Eq. (142) is pointwise convergence in $y$ for fixed ${x_0}$.

In summary, we conclude that

with pointwise convergence.

On the other hand, $\forall n$ we have that

is finite, in fact,

By using $||{f_n}||_{{L^2}}^2 \lt \infty ,\;\forall n$, we rewrite Eq. (143) as

where

It can be noted that Eq. (147) has the same form as Eq. (93), the so-called modal expansion. The functions ${\psi _n}$ though are (generally) different from the eigenfunctions. This is why Eq. (147) is called a *pseudomodal* expansion [18]. The above procedure offers a rather simple way to construct such an expansion.

Let us work out an example using the following kernel,

where ${J_0}$ denotes the Bessel function of the first kind and zero order, $h(x) \in {L^2}(X)$, and $b$ is a positive constant. We can write [5]

In this case we can use as ${F_n}(t)$

so that

## 10. GENERALIZED SAMPLING

Working in RKHS, it is possible to generalize the sampling theorem to a vast class of cases [21,22]. We give here a basic derivation. With reference to the integral representation, let us suppose a countable set of values ${t_n}$ exists for which the functions $A(t,{t_n})$ form a complete orthonormal system. Then the typical function $F(t)$ appearing in Eq. (120) can be expanded as follows:

As a consequence, the function $f(x)$ represented by Eq. (120) takes the form

To see the connection with the traditional sampling, let us write $f(x)$ as

This has the form of Eq. (120) with $A$ given by The functions $A(x,n)$ (for integer $n$) form a complete orthonormal set in (${-}{\rm{1/2}},\;{\rm{1/2}}$). The associated kernel (118) is nothing but sinc $(x - n)$, so that Eq. (157) gives the ordinary sampling theorem.

As an example, let us consider again the Fresnel transform [20], denoted for a typical function $f(t)$ with the symbol ${\hat f_\alpha}(x)$, which is defined as

A function $f(\cdot)$ is said to be $(\alpha)$-Fresnel limited if its Fresnel transform ${\hat f_\alpha}(x)$ vanishes outside an interval $|x| \lt {\xi _0}$. Assuming $f$ to be $\alpha$-limited, let us evaluate $K(x,y)$, using for $A$, ${A^*}$ the expressions deducible from Eqs. (160) and (162):

## 11. REPRODUCING KERNEL HILBERT SPACES IN OPTICS

The interest of RKHSs in optics stems, first of all, from the Moore–Aronszajn theorem, according to which every RKHS possesses a non-negative definite reproducing kernel $K(x,y)$ and conversely every non-negative definite kernel specifies a RKHS. Since any cross-spectral density (CSD) has to be non-negative definite [3], we can say that any CSD specifies a RKHS and vice versa. This means that the very large number of CSDs devised throughout several years in coherence theory offers a good pool of examples for the theory of RKHSs. Quite often, the integral equations associated to CSDs were solved explicitly, thus giving the pertinent eigenfunctions and eigenvalues. On the other hand, there is a collection of reproducing kernels thoroughly studied in mathematics that did not penetrate into optics. Szegö’s and Bergman’s kernels are examples of this and could deserve attention from the opticist.

It is to be stressed that the knowledge of the RKHS associated to a certain CSD gives access to the regularity features of the underlying optical fields (Sections 6 and 10). This is of interest whenever the associated optical fields are to be processed with such operations as filtering or sampling.

The integral representation of RKs (Section 8) has offered a safe criterion for devising genuine CSDs in hundreds of papers. It is worthwhile to remark that such a representation has a further merit in that it specifies an integral transform. Accordingly, besides well-known transforms (such as Fourier, Fresnel, and Hankel) there is a wealth of new transforms that would deserve to be explored.

A stimulating challenge is to inquire whether, in addition to known cases, it is possible to transform other possibilities offered by RKHSs into laboratory techniques. Let us refer to a concrete example: in Eq. (56) we saw a case in which the inner product of two signals requires one to integrate the product of their first derivatives. It does not seem that great difficulties should arise in converting such an operation into an experimental procedure, possibly combining analog and numerical techniques. Similar possibilities exist for several other operations involved in RKHS analysis. This affords a rich field of action to the experimentalist in addition to the theoretical subjects treated above.

## APPENDIX A: FUNCTIONAL ANALYSIS VIEW

In the present paper, we followed an approach to RKHSs in which we used the type of mathematical tools adopted in Fourier optics and in coherence theory. Current alternative presentations rely more often on concepts of functional analysis. For the sake of completeness as well as a suggestion toward extensions, we give here some hints in this direction.

A *functional* in a Hilbert space of functions is an operator $F$ that, upon application to a typical function $f(\cdot)$, gives rise to a (generally complex) number ${a_f}$, i.e.,

*linear* functionals, i.e., functionals such that with obvious symbols. A simple example is as follows. Given a function $g(\cdot)$, the inner product of a typical function $f(\cdot)$ with $g(\cdot)$ gives a (complex) number. Thus we created the simple functional where $g$ is fixed. In a sense, the function $g(\cdot)$ represents the functional $F$. An amazing result is that under reasonable regularity hypotheses all the possible functionals can be given the present form. To express this in precise terms we need some more definitions.

A linear functional is said to be *continuous* if $\forall \varepsilon \gt 0$ there exists a ${\delta _\varepsilon} \gt 0$ such that if $||f|| \lt {\delta _\varepsilon}$ then $|F\{f\} | \lt \varepsilon$.

A linear functional is said to be *bounded* if $\exists M \gt 0$ such that $\forall f(\cdot)$ we have $|F\{f(\cdot)\} | \le M||f||$ where $M$ does not depend on $f$.

It can be proved that a continuous functional is bounded and vice versa.

Now we can give the fundamental *representation theorem* of Riesz and Fréchet [13]: *In a Hilbert space of functions any continuous (or bounded) functional* $F$ *can be given the form of an inner product* $F\{f(\cdot)\} = {\langle f(\cdot),g(\cdot)\rangle _{\cal H}}$ *where* $g$ *depends on* $F$ *but not on* $f$.

A peculiar type of functional is the so-called *evaluation* functional at $y \in X$. Denoting it by ${V_y}$, we define it as one that when applied to $f(\cdot)$ gives rise to $f(y)$, i.e.,

Compared with the reproduction property from Eq. (33), we conclude that ${v_y}(\cdot) = K(\cdot ,y)$.

With the above elements it is possible to show that a RKHS can be defined as follows: *A Hilbert space of functions defined on* $X$ *is a RKHS if the evaluation functional at* $x$ *is continuous*$\;\forall x \in X$.

It will be noted that the above definition does not mention the reproduction property, which is instead proved as a consequence of the definition itself.
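A finite-dimensional analogue may help to visualize the representation theorem and the role of the evaluation functional. The sketch below is only an illustration under stated assumptions: "functions" are vectors with four components, the inner product is a weighted sum that is linear in its first argument, and the weights and vectors are hypothetical. A linear functional with coefficients $c$ is represented by the vector ${c^*}/w$, and the evaluation functional at index $j$ is represented by a vector that plays the role of $K(\cdot ,j)$.

```python
import numpy as np

w = np.array([0.5, 1.0, 2.0, 4.0])      # positive weights defining the inner product

def inner(f, g):
    # Weighted inner product, linear in f and conjugate-linear in g.
    return np.sum(w * f * np.conj(g))

def representer(c):
    # Vector g such that sum_k c_k f_k = <f, g> for every f.
    return np.conj(c) / w

c = np.array([1 + 2j, -0.5j, 3.0, 0.25])   # coefficients of a linear functional
f = np.array([2.0, 1j, -1 + 1j, 4.0])      # a typical "function"

F_direct = np.sum(c * f)                   # functional applied directly
F_riesz = inner(f, representer(c))         # same value through the inner product

j = 2
e_j = np.zeros(4)
e_j[j] = 1.0
k_j = representer(e_j)                     # analogue of K(., j)
value = inner(f, k_j)                      # evaluation functional: returns f[j]

print(F_direct, F_riesz, value, f[j])
```

In this finite-dimensional setting every linear functional is bounded, so the evaluation functional is always continuous; in function spaces this is precisely the property that singles out RKHSs.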

## Funding

Ministerio de Economía y Competitividad (PID2019-104268GB-C21).

## Acknowledgment

The authors gratefully thank Gemma Piquero, Juan Carlos Gonzáles de Sande, and Massimo Santarsiero for useful discussions on the subject matter of this paper.

## Disclosures

The authors declare no conflicts of interest.

## Data Availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

## REFERENCES

**1. **J. W. Goodman, *Introduction to Fourier Optics* (McGraw-Hill, 1968).

**2. **A. Papoulis, *Systems and Transforms with Applications in Optics* (McGraw-Hill, 1968).

**3. **L. Mandel and E. Wolf, *Optical Coherence and Quantum Optics* (Cambridge University, 1995).

**4. **A. Berlinet and C. Thomas-Agnan, *Reproducing Kernel Hilbert Spaces in Probability and Statistics* (Kluwer Academic, 2004).

**5. **M. Abramowitz and I. Stegun, *Handbook of Mathematical Functions* (National Bureau of Standards, 1972).

**6. **G. Szegö, *Orthogonal Polynomials* (American Mathematical Society, 1975).

**7. **S. Saitoh and Y. Sawano, *Theory of Reproducing Kernels and Applications* (Springer, 2016).

**8. **R. Martínez-Herrero and F. Gori, “Christoffel-Darboux sources,” Opt. Lett. **46**, 973–976 (2021).

**9. **N. Aronszajn, “Theory of reproducing kernels,” Trans. Am. Math. Soc. **68**, 337–404 (1950).

**10. **S. Saitoh, *Integral Transforms, Reproducing Kernels and Their Applications* (Longman, 1997).

**11. **V. I. Paulsen and M. Raghupathi, *An Introduction to the Theory of Reproducing Kernel Hilbert Spaces* (Cambridge, 2016).

**12. **R. A. Kennedy and P. Sadeghi, *Hilbert Space Methods in Signal Processing* (Cambridge University, 2013).

**13. **F. Riesz and B. Sz. Nagy, *Functional Analysis* (Blackie and Sons, 1956).

**14. **F. G. Tricomi, *Integral Equations* (Dover, 1985).

**15. **M. Santarsiero, G. Piquero, J. C. G. de Sande, and F. Gori, “Difference of cross-spectral densities,” Opt. Lett. **39**, 1713–1716 (2014).

**16. **E. Parzen, “An approach to time series analysis,” Ann. Math. Stat. **32**, 951–989 (1961).

**17. **F. Gori and M. Santarsiero, “Devising genuine spatial correlation functions,” Opt. Lett. **32**, 3531–3533 (2007).

**18. **R. Martínez-Herrero, P. M. Mejías, and F. Gori, “Genuine cross-spectral densities and pseudo-modal expansions,” Opt. Lett. **34**, 1399–1401 (2009).

**19. **R. L. Eubank and T. Hsing, “Canonical correlation for stochastic processes,” Stochastic Process. Appl. **118**, 1634–1661 (2008).

**20. **F. Gori, “Fresnel transform and sampling theorem,” Opt. Commun. **39**, 293–297 (1981).

**21. **A. G. Garcia, “Orthogonal sampling formulas: a unified approach,” SIAM Rev. **42**, 499–512 (2000).

**22. **D. Han, M. Z. Nashed, and Q. Sun, “Sampling expansions in reproducing kernel Hilbert and Banach spaces,” Num. Funct. Anal. Optim. **30**, 971–987 (2009).