Background
When I was a high school student at TDV-BTL (Türkiye Diyanet Vakfı – Bakü Türk Lisesi), I happened to be thinking about functions and their properties during the 10th/11th grades. I was mainly wondering what happens when we change the formula of a function a little bit: do its outputs change completely, by a large offset, or is it predictable what the function looked like before its formula was changed syntactically, just by looking at its new inputs and outputs? One day something interesting occurred to me as I was playing with arbitrary linear and quadratic functions. When I fixed the inputs and outputs on a table vertically, so that the (euclidean) difference between each successive input was the same, the consequent differences in the Y column stopped changing after some iteration, no matter how many rows (i.e., data points, or x and y pairs) I added to the table, as long as I respected the same-successive-difference rule on the X column. For linear functions, I observed that the differences between the successive y values in the Y column were all the same after putting them on a new column, say, Y′; for quadratic functions, the successive differences stabilized when I repeated the same procedure for the Y′ column, putting the new differential values on another column, say, Y″. Then I thought that if I just knew at what iteration the differences stabilized, I could propagate this difference backwards from that stabilized column to the Y column; this would mean that I am able to find the next y value on the Y column for the next x value on the X column that was not initially given to me. And since it did not matter which (polynomial) function I used to generate the values for the initial [X | Y] table for the successive differences to stabilize after some iteration(s), it meant that someone else could give me these values without telling me the generator function itself, and I would still be able to find the next (x, y) pair just by using the table.
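The table trick above can be sketched in a few lines of Python (a minimal sketch of my own; the sampled function and the number of rows are arbitrary choices for illustration):

```python
# Build the difference columns Y, Y', Y'', ... for evenly spaced samples,
# stopping once a column stabilizes, then predict the next y value
# without ever consulting the generator function again.

def difference_columns(ys):
    """Return the columns [Y, Y', Y'', ...], stopping at a constant column."""
    columns = [list(ys)]
    while len(set(columns[-1])) > 1:
        prev = columns[-1]
        columns.append([b - a for a, b in zip(prev, prev[1:])])
    return columns

def predict_next(ys):
    """Propagate the stabilized difference backward to get the next y."""
    # The next entry of each column is its last entry plus the next entry
    # of the column below; the constant bottom column just repeats itself,
    # so the prediction is the sum of the last entries of all columns.
    return sum(column[-1] for column in difference_columns(ys))

# Samples of f(x) = x^2 at x = 0..4; only the outputs are used to predict.
print(predict_next([x * x for x in range(5)]))  # 25, i.e. f(5)
```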
This was just the beginning of a very long and interesting adventure that I wasn’t even aware of until recently. I wasn’t doing what I was doing back then because I knew what I was doing or because I was aware of the significance of prediction in general; I hadn’t even heard about AI yet. I was just having fun playing with functions and their input/output values. As a young student, I was very curious about mathematics, and that curiosity was the only reason I kept trying stupid stuff to see if it led somewhere interesting.
It is not interesting to try to solely capture what’s changing; rather, it is very important to capture what is not changing when everything else is changing.
It wasn’t until two years ago that I realized that all that matters is to find invariants in your problem. That’s when I took note of my own quote above. An invariant is something that is stable all the way from the beginning to the end of some process. I also used to think about universal invariants that defined the very reality we live in; I called them primes of the universe. I also used to think of physics rules as such primes, and as was the case with natural numbers in mathematics, I believed that combining different primes (even maybe with a different order, unlike natural numbers) would yield different events or processes in our reality. So, the physics rules, such as gravity, would not change at all, although the everyday events were completely dynamic and changing. However, all that was the product of my habit of wandering around in the imagination space too much.
Without diving too much into my crazy imagination space, I would like to emphasize a few points on why this observation was actually a very significant finding:
- It actually brought up the idea (probably one of today’s most important ideas) of propagating the difference backward,
- It was my first notable systematic attempt to predict the behavior of a function by using only its input and output values,
- It helped me (unconsciously) realize how invariants are actually important.
At the same time, I tried to prove why my algorithm was working correctly. Back then, when I realized the P.I.P. algorithm for the first time, I couldn’t see where that invariant difference value came from; I observed that it was not the same for some of the different quadratic functions while being the same for another “special” set of different quadratic functions. I repeated the experiments for higher-degree polynomials, and only then was I able to get a sense of what was going on – it was the difference in the X column, the highest degree of a polynomial function, and the coefficient of its corresponding term that decided the final difference value on some Y″ column. For example, the sets of samples for two quadratics sharing the same leading coefficient, such as $f(x) = 3x^2 + 2x - 1$ and $g(x) = 3x^2 - 5x + 7$, would give the same difference value on Y″ if and only if the differences of successive pairs on the X columns for the two sets of samples were the same.
It was not too hard to realize another interesting property after the previous one – when the difference on the X column was 1, the stabilized difference on the Y side was going to be $n! \cdot a_n$ (where $n$ is the highest degree, and $a_n$ is the coefficient of its corresponding term). Since I had such intuitions about what was going on, it did not take me too long to prove the correctness of the “mysterious” P.I.P. theorem. Of course, it ceased to be mysterious right after I finished proving the theorem, as it always does. I am going to share the theorem and its proof in this blog post, but I will not write about how this theorem has empowered my imagination to invent new things, at least not in this post.
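This intuition is easy to check numerically. Below is a small sketch (the two cubics are arbitrary, chosen only to share the leading coefficient 5): with an X-step of 1, the third differences of both stabilize at $3! \cdot 5 = 30$.

```python
def nth_differences(ys, n):
    """Apply the successive-difference step n times to the Y column."""
    for _ in range(n):
        ys = [b - a for a, b in zip(ys, ys[1:])]
    return ys

# Two different cubics sharing the leading coefficient 5, sampled with step 1.
f = lambda x: 5 * x**3 + 2 * x - 7
g = lambda x: 5 * x**3 - 9 * x**2 + 1

print(nth_differences([f(x) for x in range(6)], 3))  # [30, 30, 30]
print(nth_differences([g(x) for x in range(6)], 3))  # [30, 30, 30]
```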
Theory.
Theorem. Let $f$ be a polynomial function of degree $n$ and $(X, Y) = \{(x_1, y_1), \dots, (x_m, y_m)\}$ be the set of $m \geq n + 1$ ordered samples of $f$ where $x_{i+1} - x_i = h$ for all $i$, that is, the euclidean distance between each consequent $x_i$ is always the same. Then the next $(x, y)$ pair after the last sample $(x_m, y_m)$ can be deduced by the following procedure:
- Let $Y^{(0)} = Y$,
- Let $y^{(k)}_i = y^{(k-1)}_{i+1} - y^{(k-1)}_i$ for $k = 1, \dots, n$, so that $Y^{(k)}$ is the column of successive differences of $Y^{(k-1)}$,
- Let $x_{m+1} = x_m + h$, and $y_{m+1} = y_m + \sum_{k=1}^{n} y^{(k)}_{m-k}$ (the sum of the last entries of all the columns),

where for any $i$, $y^{(n)}_i$ is the same invariant value, and $x_{m+1}$ has not been chosen as anything other than $x_m + h$ in order to maintain the harmony between the $(x, y)$ pair.
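A direct transcription of this procedure (a sketch under my own naming; the degree-2 generator and the step $h = 0.5$ are arbitrary illustrations):

```python
def next_pair(xs, ys, n):
    """Given m >= n + 1 evenly spaced samples of a degree-n polynomial,
    predict the next (x, y) pair by backward difference propagation."""
    h = xs[1] - xs[0]
    prediction = ys[-1]
    column = ys
    for _ in range(n):
        column = [b - a for a, b in zip(column, column[1:])]
        prediction += column[-1]  # last entry of each difference column
    return (xs[-1] + h, prediction)

# Samples of f(x) = 2x^2 - x + 3 at x = 0, 0.5, 1, 1.5.
xs = [0.0, 0.5, 1.0, 1.5]
ys = [2 * x * x - x + 3 for x in xs]
print(next_pair(xs, ys, 2))  # (2.0, 9.0), and indeed f(2) = 9
```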
Remark. For any positive integer power function or polynomial $f$ of degree $n$, the equality $y^{(n)}_i = n! \, a_n h^n$ holds for every $i$; it is the invariant, and to obtain such an invariant, there must be at least $n + 1$ ordered samples.
Proof. Let $f(x) = \sum_{k=0}^{n} a_k x^k$ be the positive integer power function (i.e., polynomial), and $(X, Y)$ be the ordered set of samples of $f$ such that $x_{i+1} - x_i = h$. Then, the following equalities can be deduced easily:

$$\Delta f(x) = f(x + h) - f(x) = \sum_{k=0}^{n} a_k \left[ (x + h)^k - x^k \right]$$

Since $a_0$ is going to be multiplied by $(x + h)^0 - x^0 = 0$ when $k = 0$, we can omit the first term in the summation formula as shown below:

$$\Delta f(x) = \sum_{k=1}^{n} a_k \left[ (x + h)^k - x^k \right]$$

It is obvious that $\Delta f$ is also a polynomial (i.e., positive integer power function), and therefore must have the same general formula as any polynomial function. The only exception is that the degree of the current polynomial will be one less than the previous one, since when $k = n$, the highest power of the variable $x$ is $n - 1$, which can be observed as shown below:

$$(x + h)^n - x^n = \binom{n}{1} h x^{n-1} + \binom{n}{2} h^2 x^{n-2} + \dots + \binom{n}{n} h^n$$
Now, let’s try to figure out how many terms we can group together after expanding $\Delta f(x)$. We can solve this problem just by acknowledging one fact – each differential (i.e., $(x + h)^k - x^k$) is expanded into a sequence of terms with the degrees going from $k - 1$ down to $0$. Therefore, an expansion of each differential with the degree of $k$ is, by itself, another polynomial of the degree $k - 1$. For example, if $n = 3$, then the following expansions must be observed:

$$(x + h)^1 - x^1 = h$$
$$(x + h)^2 - x^2 = 2hx + h^2$$
$$(x + h)^3 - x^3 = 3hx^2 + 3h^2x + h^3$$
We can also use the Pascal triangle to calculate the coefficients of each term in the expansion. The ultimate formula for $\Delta f(x)$ with the polynomial degree of $n$ could be written as follows:

$$\Delta f(x) = \sum_{k=1}^{n} a_k \sum_{j=0}^{k-1} \binom{k}{j} h^{k-j} x^{j}$$
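This Pascal-triangle double sum can be sanity-checked against a direct evaluation of $f(x + h) - f(x)$ (a sketch; the cubic, $x$, and $h$ are arbitrary choices):

```python
from math import comb

# f(x) = 3x^3 + 4x^2 - 2x + 7, written as its coefficient list a_0..a_3.
a = [7, -2, 4, 3]
f = lambda x: sum(a_k * x**k for k, a_k in enumerate(a))
x, h = 5, 2

direct = f(x + h) - f(x)
# Double sum: k runs over degrees 1..n, j over the surviving powers 0..k-1.
double_sum = sum(
    a[k] * sum(comb(k, j) * h**(k - j) * x**j for j in range(k))
    for k in range(1, len(a))
)
print(direct, double_sum)  # 746 746
```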
For mathematical conveniency, let $\Delta^0 f = f$ and $\Delta^d f = \Delta(\Delta^{d-1} f)$. Now, we can deduce the general formula for $\Delta^d f$ by also adding up the fact that for each depth $d$, $\Delta^d f$ has to consider only the highest degree of its ancestor polynomial as the new highest polynomial degree $n_d = n - d$ (e.g., $n_1 = n - 1$ since $\Delta^1 f$ has the polynomial degree of $n - 1$, but since $\Delta^2 f$ has the polynomial degree of $n - 2$, $n_2 = n - 2$):

$$\Delta^d f(x) = a_n \, n (n - 1) \cdots (n - d + 1) \, h^d \, x^{n-d} + (\text{terms of degree lower than } n - d)$$
We can use the above-shown formula to find the invariant within the recursion of differentials. It is obvious that, since we want to obtain an invariant, we need to find a value of $d$ such that there is no free variable to be found in $\Delta^d f(x)$. It is easy to observe that if we set $d = n$, then $x^{n-d} = x^0 = 1$ and no lower-degree terms remain, so $\Delta^n f(x)$ is free of $x$. Now, we can use the formula for $\Delta^d f$ to find one particular $\Delta^n f$:

$$\Delta^n f(x) = a_n \, n (n - 1) \cdots (n - n + 1) \, h^n$$
By multiplying each consequent factor $n - d + 1$ for $d = 1, \dots, n$, we can rewrite the whole equality as shown below and deduce its final expression:

$$\Delta^n f(x) = a_n \left( \prod_{d=1}^{n} (n - d + 1) \right) h^n = n! \, a_n \, h^n$$

Therefore, we have successfully shown that there exists an invariant at some depth in the recursion of differentials, and the first occurrence of such an invariant appears at depth $d = n$, where it can be computed by using the formula $\Delta^n f = n! \, a_n \, h^n$.
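Finally, the closed form $\Delta^n f = n! \, a_n h^n$ is easy to verify empirically (a sketch; the degree-5 polynomial, the step $h = 3$, and the starting point are arbitrary choices):

```python
from math import factorial

def nth_difference(f, n, h, x0=0.0):
    """Take n + 1 evenly spaced samples of f and return their n-th difference."""
    ys = [f(x0 + i * h) for i in range(n + 1)]
    for _ in range(n):
        ys = [b - a for a, b in zip(ys, ys[1:])]
    return ys[0]

# f(x) = 4x^5 - x^2 + 6, so n = 5 and a_n = 4; pick h = 3.
n, a_n, h = 5, 4, 3
f = lambda x: a_n * x**n - x**2 + 6

print(nth_difference(f, n, h, x0=2.0))  # 116640.0
print(factorial(n) * a_n * h**n)        # 116640 = 120 * 4 * 243
```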