An automatic report for the dataset : Walmart (WMT)
Abstract
This report was produced by the Automatic Bayesian Covariance Discovery (ABCD) algorithm.
1 Executive summary
The raw data and full model posterior with extrapolations are shown in figure 1.
The structure search algorithm has identified nine additive components in the data. The first 4 additive components explain 96.5% of the variation in the data as shown by the coefficient of determination (${R}^{2}$) values in table 1. The 9 additive components explain 100.0% of the variation in the data. After the first 4 components the cross validated mean absolute error (MAE) does not decrease by more than 0.1%. This suggests that subsequent terms are modelling very short term trends, uncorrelated noise or are artefacts of the model or search procedure. Short summaries of the additive components are as follows:

• A constant. This function applies until 19 Jun 2001, from 22 Jun 2001 until 22 Sep 2001 and from 26 Sep 2001 onwards.
• A smooth function with linearly increasing marginal standard deviation. This function applies from 22 Sep 2001 until 26 Sep 2001.
• A very smooth function with linearly increasing marginal standard deviation. This function applies from 19 Jun 2001 until 22 Jun 2001.
• A smooth function. This function applies until 22 Sep 2001 and from 26 Sep 2001 onwards.
• Uncorrelated noise. This function applies until 31 Jul 2001.
• Uncorrelated noise. This function applies from 31 Jul 2001 onwards.
• Uncorrelated noise with linearly increasing standard deviation. This function applies from 19 Jun 2001 until 22 Jun 2001.
• Uncorrelated noise with linearly increasing standard deviation. This function applies from 22 Sep 2001 until 26 Sep 2001.
• A smooth function. This function applies from 22 Sep 2001 until 26 Sep 2001.

#  ${R}^{2}$ (%)  $\mathrm{\Delta}{R}^{2}$ (%)  Residual ${R}^{2}$ (%)  Cross validated MAE  Reduction in MAE (%)
-  -  -  -  40.41  -
1  145.9  145.9  145.9  3.22  92.0 
2  7.7  138.3  56.2  2.15  33.3 
3  82.3  90.0  83.6  1.34  37.4 
4  96.5  14.2  80.2  1.32  2.1 
5  96.7  0.2  5.6  1.32  0.0 
6  98.1  1.4  41.9  1.32  0.0 
7  98.4  0.4  18.9  1.32  0.0 
8  98.4  0.0  0.1  1.32  0.0 
9  100.0  1.6  100.0  1.32  0.1 
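The relationship between the last two columns of table 1 can be sketched as follows. This is a minimal illustration, assuming that "Reduction in MAE (%)" is the relative change in cross validated MAE between consecutive rows; the function name `reduction_percent` is hypothetical. Because the tabulated MAE values are rounded to two decimals, the recomputed percentages agree with the table only approximately.

```python
# Cross validated MAE values copied from table 1 (rounded to two decimals).
# The first entry is the baseline before any component is added.
mae = [40.41, 3.22, 2.15, 1.34, 1.32]


def reduction_percent(before, after):
    """Relative reduction in MAE, in percent (assumed definition)."""
    return 100.0 * (before - after) / before


# One reduction per added component, comparing each row with the previous one.
reductions = [reduction_percent(b, a) for b, a in zip(mae, mae[1:])]

for i, r in enumerate(reductions, start=1):
    print(f"component {i}: reduction in MAE of about {r:.1f}%")
```

For component 1 this reproduces the reported 92.0%; for later components the rounding of the inputs shifts the result by a few tenths of a percentage point.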
Model checking statistics are summarised in table 2 in section 4. These statistics have revealed statistically significant discrepancies between the data and model in component 2.
The rest of the document is structured as follows. In section 2 the forms of the additive components are described and their posterior distributions are displayed. In section 3 the modelling assumptions of each component are discussed with reference to how this affects the extrapolations made by the model. Section 4 discusses model checking statistics, with plots showing the form of any detected discrepancies between the model and observed data.
2 Detailed discussion of additive components
2.1 Component 1 : A constant. This function applies until 19 Jun 2001, from 22 Jun 2001 until 22 Sep 2001 and from 26 Sep 2001 onwards
This component is constant. This component applies until 19 Jun 2001, from 22 Jun 2001 until 22 Sep 2001 and from 26 Sep 2001 onwards.
This component explains 145.9% of the total variance. The addition of this component reduces the cross validated MAE by 92.0% from 40.4 to 3.2.
2.2 Component 2 : A smooth function with linearly increasing marginal standard deviation. This function applies from 22 Sep 2001 until 26 Sep 2001
This component is a smooth function with a typical lengthscale of 5.1 weeks. The marginal standard deviation of the function increases linearly. This component applies from 22 Sep 2001 until 26 Sep 2001.
This component explains 56.2% of the residual variance; this changes the total variance explained from 145.9% to 7.7%. The addition of this component reduces the cross validated MAE by 33.31% from 3.22 to 2.15.
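A covariance function with the qualitative properties described for this component can be sketched as below. This is an illustration under stated assumptions, not the parameterisation used by ABCD: it multiplies a squared exponential kernel (smoothness) by a linear kernel (linearly growing marginal standard deviation) and by a sigmoidal window (restriction to an interval). All function names, the time units and the parameter values are hypothetical.

```python
import numpy as np


def se_kernel(x1, x2, lengthscale):
    """Squared exponential kernel: draws are smooth functions."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)


def lin_kernel(x1, x2, offset):
    """Linear kernel: the marginal standard deviation is |x - offset|."""
    return (x1[:, None] - offset) * (x2[None, :] - offset)


def window(x, start, end, steepness):
    """Smooth indicator that is close to 1 on [start, end] and 0 elsewhere."""
    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))
    return sigmoid(steepness * (x - start)) * sigmoid(steepness * (end - x))


def component_kernel(x1, x2, lengthscale, offset, start, end, steepness=50.0):
    """Windowed product of a smooth kernel and a linear kernel."""
    k = se_kernel(x1, x2, lengthscale) * lin_kernel(x1, x2, offset)
    w1 = window(x1, start, end, steepness)
    w2 = window(x2, start, end, steepness)
    return w1[:, None] * k * w2[None, :]


# Arbitrary time units; the window [4, 6] stands in for a short date range.
x = np.linspace(0.0, 10.0, 101)
K = component_kernel(x, x, lengthscale=1.0, offset=2.0, start=4.0, end=6.0)

# The marginal standard deviation is the square root of the kernel diagonal.
sd = np.sqrt(np.diag(K))
```

Inside the window the marginal standard deviation `sd` grows linearly with `x`, and outside it is effectively zero, which matches the verbal description of a function that applies only over a limited range.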
2.3 Component 3 : A very smooth function with linearly increasing marginal standard deviation. This function applies from 19 Jun 2001 until 22 Jun 2001
This component is a very smooth function. The marginal standard deviation of the function increases linearly. This component applies from 19 Jun 2001 until 22 Jun 2001.
This component explains 83.6% of the residual variance; this increases the total variance explained from 7.7% to 82.3%. The addition of this component reduces the cross validated MAE by 37.38% from 2.15 to 1.34.
2.4 Component 4 : A smooth function. This function applies until 22 Sep 2001 and from 26 Sep 2001 onwards
This component is a smooth function with a typical lengthscale of 2.4 days. This component applies until 22 Sep 2001 and from 26 Sep 2001 onwards.
This component explains 80.2% of the residual variance; this increases the total variance explained from 82.3% to 96.5%. The addition of this component reduces the cross validated MAE by 2.09% from 1.34 to 1.32.
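The way a residual ${R}^{2}$ value combines with the previous total can be checked with a short calculation. This sketch assumes that the residual ${R}^{2}$ is the fraction of the previously unexplained variance that the new component removes; the function name `combine_r2` is hypothetical. The check is applied only to components 4 to 6 and is not claimed to hold for every row of table 1.

```python
def combine_r2(previous_total, residual):
    """Total R^2 (%) after a component explains `residual` % of what was left."""
    unexplained = (1.0 - previous_total / 100.0) * (1.0 - residual / 100.0)
    return 100.0 * (1.0 - unexplained)


# (previous total R^2, residual R^2, reported new total R^2), all in percent,
# copied from the descriptions of components 4, 5 and 6.
rows = [(82.3, 80.2, 96.5), (96.5, 5.6, 96.7), (96.7, 41.9, 98.1)]

for previous, residual, reported in rows:
    computed = combine_r2(previous, residual)
    print(f"computed {computed:.1f}%, reported {reported}%")
```

For these three components the computed totals match the reported values to one decimal place.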
2.5 Component 5 : Uncorrelated noise. This function applies until 31 Jul 2001
This component models uncorrelated noise. This component applies until 31 Jul 2001.
This component explains 5.6% of the residual variance; this increases the total variance explained from 96.5% to 96.7%. The addition of this component leaves the cross validated MAE unchanged at 1.32 (a reduction of 0.00%). This component explains residual variance but does not improve MAE, which suggests that it describes very short term patterns or uncorrelated noise, or is an artefact of the model or search procedure.
2.6 Component 6 : Uncorrelated noise. This function applies from 31 Jul 2001 onwards
This component models uncorrelated noise. This component applies from 31 Jul 2001 onwards.
This component explains 41.9% of the residual variance; this increases the total variance explained from 96.7% to 98.1%. The addition of this component leaves the cross validated MAE unchanged at 1.32 (a reduction of 0.00%). This component explains residual variance but does not improve MAE, which suggests that it describes very short term patterns or uncorrelated noise, or is an artefact of the model or search procedure.
2.7 Component 7 : Uncorrelated noise with linearly increasing standard deviation. This function applies from 19 Jun 2001 until 22 Jun 2001
This component models uncorrelated noise. The standard deviation of the noise increases linearly. This component applies from 19 Jun 2001 until 22 Jun 2001.
This component explains 18.9% of the residual variance; this increases the total variance explained from 98.1% to 98.4%. The addition of this component leaves the cross validated MAE unchanged at 1.32 (a reduction of 0.00%). This component explains residual variance but does not improve MAE, which suggests that it describes very short term patterns or uncorrelated noise, or is an artefact of the model or search procedure.
2.8 Component 8 : Uncorrelated noise with linearly increasing standard deviation. This function applies from 22 Sep 2001 until 26 Sep 2001
This component models uncorrelated noise. The standard deviation of the noise increases linearly. This component applies from 22 Sep 2001 until 26 Sep 2001.
This component explains 0.1% of the residual variance; this leaves the total variance explained at 98.4% to one decimal place. The addition of this component leaves the cross validated MAE unchanged at 1.32 (a reduction of 0.00%). This component neither explains residual variance nor improves MAE and is therefore likely to be an artefact of the model or search procedure.
2.9 Component 9 : A smooth function. This function applies from 22 Sep 2001 until 26 Sep 2001
This component is a smooth function with a typical lengthscale of 38.6 hours. This component applies from 22 Sep 2001 until 26 Sep 2001.
This component explains 100.0% of the residual variance; this increases the total variance explained from 98.4% to 100.0%. The addition of this component increases the cross validated MAE by 0.05% from 1.32 to 1.32. This component explains residual variance but does not improve MAE which suggests that this component describes very short term patterns, uncorrelated noise or is an artefact of the model or search procedure.
3 Extrapolation
Summaries of the posterior distribution of the full model are shown in figure 19. The plot on the left displays the mean of the posterior together with pointwise variance. The plot on the right displays three random samples from the posterior.
Below are descriptions of the modelling assumptions associated with each additive component and how they affect the predictive posterior. Plots of the pointwise posterior and samples from the posterior are also presented, showing extrapolations from each component and the cumulative sum of components.
3.1 Component 1 : A constant. This function applies until 19 Jun 2001, from 22 Jun 2001 until 22 Sep 2001 and from 26 Sep 2001 onwards
This component is assumed to stay constant.
3.2 Component 2 : A smooth function with linearly increasing marginal standard deviation. This function applies from 22 Sep 2001 until 26 Sep 2001
This component is assumed to stop before the end of the data and will therefore be extrapolated as zero.
3.3 Component 3 : A very smooth function with linearly increasing marginal standard deviation. This function applies from 19 Jun 2001 until 22 Jun 2001
This component is assumed to stop before the end of the data and will therefore be extrapolated as zero.
3.4 Component 4 : A smooth function. This function applies until 22 Sep 2001 and from 26 Sep 2001 onwards
This component is assumed to continue smoothly but is also assumed to be stationary so its distribution will return to the prior. The prior distribution places mass on smooth functions with a marginal mean of zero and a typical lengthscale of 2.4 days.
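The return of a stationary component to its prior can be illustrated with a small Gaussian process regression. This is a sketch under stated assumptions: only the lengthscale of 2.4 days comes from the report, while the squared exponential kernel, the signal and noise standard deviations and the synthetic observations are hypothetical stand-ins.

```python
import numpy as np


def se_kernel(x1, x2, lengthscale, sf):
    """Stationary squared exponential kernel with signal std `sf`."""
    d = x1[:, None] - x2[None, :]
    return sf**2 * np.exp(-0.5 * (d / lengthscale) ** 2)


lengthscale = 2.4  # days, as reported for this component
sf = 1.0           # assumed prior marginal standard deviation
noise = 0.1        # assumed observation noise standard deviation

# Synthetic daily observations; these are not the Walmart data.
x_obs = np.arange(0.0, 20.0, 1.0)
y_obs = np.sin(x_obs / 3.0)

# Test inputs at increasing distance beyond the last observation (day 19).
x_new = np.array([20.0, 22.4, 24.8, 32.0, 44.0])

K = se_kernel(x_obs, x_obs, lengthscale, sf) + noise**2 * np.eye(len(x_obs))
K_s = se_kernel(x_new, x_obs, lengthscale, sf)

# Standard zero-mean Gaussian process posterior mean and pointwise variance.
mean = K_s @ np.linalg.solve(K, y_obs)
var = sf**2 - np.sum(K_s * np.linalg.solve(K, K_s.T).T, axis=1)

for x, m, v in zip(x_new, mean, var):
    print(f"day {x:5.1f}: posterior mean {m:+.3f}, variance {v:.3f}")
```

Several lengthscales beyond the last observation the posterior mean approaches the prior mean of zero and the posterior variance approaches the prior variance `sf**2`, which is the behaviour described for this component.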
3.5 Component 5 : Uncorrelated noise. This function applies until 31 Jul 2001
This component is assumed to stop before the end of the data and will therefore be extrapolated as zero.
3.6 Component 6 : Uncorrelated noise. This function applies from 31 Jul 2001 onwards
This component assumes the uncorrelated noise will continue indefinitely.
3.7 Component 7 : Uncorrelated noise with linearly increasing standard deviation. This function applies from 19 Jun 2001 until 22 Jun 2001
This component is assumed to stop before the end of the data and will therefore be extrapolated as zero.
3.8 Component 8 : Uncorrelated noise with linearly increasing standard deviation. This function applies from 22 Sep 2001 until 26 Sep 2001
This component is assumed to stop before the end of the data and will therefore be extrapolated as zero.
3.9 Component 9 : A smooth function. This function applies from 22 Sep 2001 until 26 Sep 2001
This component is assumed to stop before the end of the data and will therefore be extrapolated as zero.
4 Model checking
Several posterior predictive checks have been performed to assess how well the model describes the observed data. These tests take the form of comparing statistics evaluated on samples from the prior and posterior distributions for each additive component. The statistics are derived from autocorrelation function (ACF) estimates, periodograms and quantile-quantile (qq) plots.
Table 2 displays cumulative probability and $p$-value estimates for these quantities. Cumulative probabilities near 0 or 1 indicate that the test statistic was lower or higher, respectively, under the posterior than under the prior unexpectedly often; that is, they contain the same information as a $p$-value for a two-tailed test and also express whether the test statistic was higher or lower than expected. $p$-values near 0 indicate that the test statistic was larger in magnitude under the posterior than under the prior unexpectedly often.
#  ACF min  ACF min loc  Periodogram max  Periodogram max loc  QQ max  QQ min
1  0.502  0.480  0.091  0.525  0.473  0.527 
2  0.479  0.283  0.018  0.712  0.297  0.404 
3  0.642  0.354  0.037  0.543  0.264  0.538 
4  0.483  0.573  0.122  0.265  0.108  0.102 
5  0.507  0.488  0.463  0.506  0.546  0.513 
6  0.499  0.518  0.523  0.508  0.211  0.219 
7  0.498  0.514  0.507  0.474  0.424  0.422 
8  0.481  0.505  0.489  0.464  0.484  0.503 
9  0.614  0.745  0.210  0.313  0.271  0.286 
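The link between a cumulative probability and a two-tailed $p$-value can be sketched as follows. The conversion below is the standard one for a two-tailed test and is assumed, not confirmed, to be the one used for this report; the function name `two_tailed_p` is hypothetical.

```python
def two_tailed_p(cumulative_probability):
    """Two-tailed p-value implied by a cumulative probability in [0, 1]."""
    return 2.0 * min(cumulative_probability, 1.0 - cumulative_probability)


# Cumulative probability reported for the periodogram maximum of component 2.
p = two_tailed_p(0.018)
print(f"two-tailed p-value: {p:.3f}")
```

Applied to the component 2 entry of 0.018, this gives 0.036, which agrees with the $p$-value quoted for that component in section 4.1.1.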
The nature of any observed discrepancies is now described and plotted and hypotheses are given for the patterns in the data that may not be captured by the model.
4.1 Moderately statistically significant discrepancies
4.1.1 Component 2 : A smooth function with linearly increasing marginal standard deviation. This function applies from 22 Sep 2001 until 26 Sep 2001
The following discrepancies between the prior and posterior distributions for this component have been detected.

• The maximum value of the periodogram is unexpectedly low. This discrepancy has an estimated $p$-value of 0.036.
4.2 Model checking plots for components without statistically significant discrepancies
4.2.1 Component 1 : A constant. This function applies until 19 Jun 2001, from 22 Jun 2001 until 22 Sep 2001 and from 26 Sep 2001 onwards
No discrepancies between the prior and posterior of this component have been detected.
4.2.2 Component 3 : A very smooth function with linearly increasing marginal standard deviation. This function applies from 19 Jun 2001 until 22 Jun 2001
No discrepancies between the prior and posterior of this component have been detected.
4.2.3 Component 4 : A smooth function. This function applies until 22 Sep 2001 and from 26 Sep 2001 onwards
No discrepancies between the prior and posterior of this component have been detected.
4.2.4 Component 5 : Uncorrelated noise. This function applies until 31 Jul 2001
No discrepancies between the prior and posterior of this component have been detected.
4.2.5 Component 6 : Uncorrelated noise. This function applies from 31 Jul 2001 onwards
No discrepancies between the prior and posterior of this component have been detected.
4.2.6 Component 7 : Uncorrelated noise with linearly increasing standard deviation. This function applies from 19 Jun 2001 until 22 Jun 2001
No discrepancies between the prior and posterior of this component have been detected.
4.2.7 Component 8 : Uncorrelated noise with linearly increasing standard deviation. This function applies from 22 Sep 2001 until 26 Sep 2001
No discrepancies between the prior and posterior of this component have been detected.
4.2.8 Component 9 : A smooth function. This function applies from 22 Sep 2001 until 26 Sep 2001
No discrepancies between the prior and posterior of this component have been detected.
5 MMD : experimental section
#  mmd 

1  0.000 
2  0.000 
3  0.000 
4  0.000 
5  0.300 
6  0.000 
7  0.130 
8  0.152 
9  0.130 