Question Incorrect ML Time Series forecasting with big values

vlad.ua96

Hey there! I was making a very simple stock prediction app (a simple college project). For the forecast I used the Microsoft ML Time Series model. It worked well, but I noticed that when I make predictions for data with big values like 1700 or 3800, the predictions are very inaccurate. For instance, if the CSV file has small values like 20.21 or 10.63, it works well and predicts values in or close to that range. But if the CSV file has columns with large values like 3700 or so, it goes crazy and roughly cuts the predicted values in half. For example, for values around 3136 it predicted 1800 or 700, depending on the model parameters.
C#:
var context = new MLContext();
var data = context.Data.LoadFromTextFile<StockData>(@"C:\Users\Vlad Mishyn\Desktop\DIPLOM1\ASIX\ASIX\bin\x64\Debug\FinalCsvFile.csv", hasHeader: true, separatorChar: ',');

var pipeline = context.Forecasting.ForecastBySsa(
    "Forecast",
    nameof(StockData.Close),
    windowSize: 7,
    seriesLength: 30,
    trainSize: 365,
    horizon: 7,
    confidenceLevel: 0.95f,
    confidenceLowerBoundColumn: "LowerBoundRentals",
    confidenceUpperBoundColumn: "UpperBoundRentals");

var model = pipeline.Fit(data);

var forecastingEngine = model.CreateTimeSeriesEngine<StockData, StockForecast>(context);

var forecasts = forecastingEngine.Predict();

internal class StockForecast
{
    public float[] Forecast { get; set; }
}

internal class StockData
{
    [LoadColumn(0)]
    public DateTime Date { get; set; }

    [LoadColumn(1)]
    public float Close { get; set; }

    [LoadColumn(2)]
    public float Volume { get; set; }

    [LoadColumn(3)]
    public float Open { get; set; }

    [LoadColumn(4)]
    public float High { get; set; }

    [LoadColumn(5)]
    public float Law { get; set; }
}

I thought maybe I chose the wrong model for prediction, but I saw a lot of people using this model for prices with much bigger values. I played with all of the model's parameters to find the optimal ones, but it didn't help. I am new to data prediction, but even in Python models I haven't seen such behavior. I'm really interested in your opinion about this. This is what the CSV file looks like. Unfortunately, I can't attach the whole file to this post.

Date,Close,Volume,Open,High,Law
11/13/2020,19.61,6527357,19.06,19.3581,18.69
11/12/2020,18.93,8879036,19.21,19.3581,18.69
11/11/2020,19.36,9736736,19.92,19.3581,18.69
11/10/2020,19.87,11227230,19.77,19.3581,18.69
11/09/2020,19.73,13947140,19.92,19.3581,18.69
11/06/2020,19.25,6515724,19.1,19.3581,18.69
11/05/2020,19.13,8926402,18.6,19.3581,18.69
11/04/2020,18.28,8512730,18.64,19.3581,18.69
11/03/2020,18.62,6340196,18.68,19.3581,18.69
11/02/2020,18.41,7499051,18.18,19.3581,18.69
10/30/2020,17.96,7883718,17.63,19.3581,18.69
10/29/2020,17.78,7722001,17.28,19.3581,18.69
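
One thing visible in the sample above is that the rows run newest first. A forecasting model that reads rows in file order would then see the series backwards, so it may be worth feeding the data oldest first and checking whether the forecasts change. The following is a minimal sketch of that reordering in plain C#, assuming the date format shown in the sample (MM/dd/yyyy) and the Close price in the second column; the `SeriesPrep` class and `CloseOldestFirst` method are hypothetical names, not part of ML.NET.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class SeriesPrep
{
    // Hypothetical helper: takes CSV data rows (header already removed)
    // in the "Date,Close,Volume,Open,High,Law" layout from the post and
    // returns the Close values sorted by date, oldest first.
    public static float[] CloseOldestFirst(IEnumerable<string> csvRows)
    {
        return csvRows
            .Select(row => row.Split(','))
            .Select(cols => new
            {
                // Assumes dates look like 11/13/2020 (month/day/year).
                Date = DateTime.ParseExact(cols[0], "MM/dd/yyyy", CultureInfo.InvariantCulture),
                Close = float.Parse(cols[1], CultureInfo.InvariantCulture)
            })
            .OrderBy(r => r.Date)
            .Select(r => r.Close)
            .ToArray();
    }
}
```

It is also worth comparing the length of the returned array with the `trainSize` passed to the model (365 in the code above), since a file with fewer rows than that gives the model less history than the parameters suggest.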
 
Please remember to put your code in code tags. I've done that for you this time.
 
Do you know the math behind the learning and prediction for that engine? If so, have you tried calculating by hand? Are the results you are getting similar to or different from what the engine is computing?

Machine Learning is not magic. It's all an application of math -- in particular statistics and probabilities.
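
Working through the model's own math by hand is a lot of work, so a cheaper first check is to compare the engine's output with a baseline that is trivial to compute. The sketch below is only that: a naive moving-average baseline swapped in for comparison, not the math the engine uses. The `ForecastSanity` class and its methods are hypothetical names, and the history is assumed to be ordered oldest first.

```csharp
using System;
using System.Linq;

public static class ForecastSanity
{
    // Naive baseline (an assumption for comparison only): repeat the mean
    // of the last `window` observations for every step of the horizon.
    public static float[] MovingAverageBaseline(float[] historyOldestFirst, int window, int horizon)
    {
        if (historyOldestFirst.Length == 0)
            throw new ArgumentException("History must not be empty.", nameof(historyOldestFirst));

        int n = Math.Min(window, historyOldestFirst.Length);
        float mean = historyOldestFirst.Skip(historyOldestFirst.Length - n).Average();
        return Enumerable.Repeat(mean, horizon).ToArray();
    }

    // Largest relative gap between a forecast and the baseline,
    // e.g. 0.5 means some forecast value is 50% away from the baseline.
    // Assumes both arrays have the same length and the baseline has no zeros.
    public static float MaxRelativeGap(float[] forecast, float[] baseline)
    {
        return forecast
            .Zip(baseline, (f, b) => Math.Abs(f - b) / Math.Abs(b))
            .Max();
    }
}
```

A gap near zero means the engine roughly agrees with the recent average. A gap around 0.5, as with the 1800 forecast for values near 3136 described in the question, suggests the problem lies in how the data reaches the model or in the parameters, and is not ordinary forecast noise.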
 
Yes, I recently tried the same prediction in Python and it worked well; the results were in the same range as the training data. I'm just missing something, but I can't figure out what exactly is wrong. It works correctly with small numbers but not with big ones.
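
One way to test whether the size of the numbers is really the cause is to rescale the large series into the range where the model behaves well, fit on the rescaled values, and scale the forecasts back. This is a diagnostic sketch, not a known fix: the `SeriesScaling` class and its methods are hypothetical names, and dividing by the largest absolute value is just one simple choice of scale.

```csharp
using System;
using System.Linq;

public static class SeriesScaling
{
    // Scale factor: the largest absolute value in the series (1 if all zeros).
    public static float MaxAbs(float[] values)
    {
        float max = values.Select(v => Math.Abs(v)).DefaultIfEmpty(0f).Max();
        return max == 0f ? 1f : max;
    }

    // Divide every value by the factor, so the series fits within [-1, 1].
    public static float[] Scale(float[] values, float factor)
    {
        return values.Select(v => v / factor).ToArray();
    }

    // Undo Scale: multiply forecasts by the same factor to get back
    // to the original price units.
    public static float[] Unscale(float[] values, float factor)
    {
        return values.Select(v => v * factor).ToArray();
    }
}
```

If the forecasts on the rescaled series, once multiplied back, still land at about half of the input range, magnitude is probably not the cause and the data order or model parameters deserve a closer look.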
 