What's the point of time series analysis?
25 votes
What is the point of time series analysis?
There are plenty of other statistical methods, such as regression and machine learning, that have obvious use cases: regression can provide information on the relationship between two variables, while machine learning is great for prediction.
But meanwhile, I don't see what time series analysis is good for. Sure, I can fit an ARIMA model and use it for prediction, but what good is that when the confidence intervals for that prediction are going to be huge? There's a reason nobody can predict the stock market despite it being the most data-driven industry in world history.
Likewise, how do I use it to understand my process further? Sure, I can plot the ACF and go "aha! there's some dependence!", but then what? What's the point? Of course there's dependence, that's why you are doing time series analysis to begin with. You already knew there was dependence. But what are you going to use it for?
time-series arima
[13] There are other use cases apart from finance and economics where they work fine. – user2974951, Sep 19 at 8:18
[28] You can't predict the stock market using other statistical and machine learning methods either; does this make them useless as well? – Tim♦, Sep 19 at 9:53
[15] You seem to imply ARIMA is not a form of regression. It is. – Firebug, Sep 19 at 12:24
[9] I think this is a good question, even if the answer seems obvious to experts. – gung♦, Sep 19 at 20:19
[6] I beg to differ with @gung and others, because the least amount of research effort would answer it. – whuber♦, Sep 20 at 1:20
(show 8 more comments)
asked Sep 19 at 8:11 by Dhalsim, edited Sep 19 at 12:25 by Firebug
7 Answers

50 votes
One main use is forecasting. I have been feeding my family for over a decade now by forecasting how many units of a specific product a supermarket will sell tomorrow, so it can order enough stock, but not too much. There is money in this.
Other forecasting use cases are given in publications like the International Journal of Forecasting or Foresight. (Full disclosure: I'm an Associate Editor of Foresight.)
Yes, sometimes the prediction intervals are huge. (I assume you mean PIs, not confidence intervals; there is a difference.) This simply means that the process is hard to forecast, and then you need to mitigate. In forecasting supermarket sales, this means you need a lot of safety stock. In forecasting sea level rises, it means you need to build higher levees. I would say that a large prediction interval does provide useful information.
Time series analysis is useful for all forecasting use cases, though forecasting is a larger topic. You can often improve forecasts by taking the dependencies in your time series into account, so you need to understand them through analysis, which is more specific than just knowing the dependencies are there.
Plus, people are interested in time series even if they do not forecast. Econometricians like to detect change points in macroeconomic time series, or to assess the impact of an intervention, such as a change in tax laws, on GDP or something else. You may want to skim through your favorite econometrics journal for more inspiration.
– Stephan Kolassa, answered Sep 19 at 8:51
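[Editor's note] To make the point about prediction intervals concrete, here is a minimal sketch of producing a forecast with intervals, assuming Python with statsmodels; the daily sales series and the ARIMA order are invented for illustration, not taken from the answer:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Simulated daily sales with weekly seasonality plus noise (illustrative data).
rng = np.random.default_rng(42)
days = pd.date_range("2018-01-01", periods=365, freq="D")
sales = 50 + 10 * np.sin(2 * np.pi * np.arange(365) / 7) + rng.normal(0, 5, 365)
y = pd.Series(sales, index=days)

# Fit a simple ARIMA model; in practice the order would be chosen
# via ACF/PACF inspection or information criteria.
res = ARIMA(y, order=(1, 0, 1)).fit()

# Forecast the next 14 days with 95% prediction intervals.
fc = res.get_forecast(steps=14)
print(fc.predicted_mean)        # point forecasts
print(fc.conf_int(alpha=0.05))  # lower/upper interval bounds
```

A wide interval here is not a failure of the method; it is the model honestly reporting how hard the series is to forecast, which is exactly the input a safety-stock calculation needs.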
[13] +1. It is useful elsewhere as well. Doing a time series analysis will certainly alert you to events (that you did not know of) affecting a result you are interested in. My colleagues and I were completely stunned to find we were generally sliiightly worse on product chemical analyses on Tuesday mornings. We tracked it back to a well-intentioned cleaning schedule that had some weaknesses. We saved close to a million over the year and improved the product Ppk from 1.7 to 1.9. Lesson learned: always do a rudimentary time series analysis on any variation-framed problem. – Stian Yttervik, Sep 19 at 13:33
+1. @Stephan Kolassa, your answer highlights how forecasts are used in the real world, which is my interpretation of the OP's question. The prediction interval point and (half-)length are exactly the information you care about for planning mitigation strategies. If you're building levees to prevent Manhattan from flooding and your new time series method reduces the prediction interval by enough, you can reduce the cost of levee construction by using only the resources necessary. Mutatis mutandis for your grocery example. – Lucas Roberts, Sep 22 at 15:14
12 votes
Goals in TS analysis, from the lecture slides of M. Dettling:
1) Exploratory Analysis:
Visualization of the properties of the series
- time series plot
- decomposition into trend/seasonal pattern/random error
- correlogram for understanding the dependency structure
2) Modeling:
Fitting a stochastic model to the data that represents and
reflects the most important properties of the series
- done via exploratory analysis or with previous knowledge
- model choice and parameter estimation is crucial
- inference: how well does the model fit the data?
3) Forecasting:
Prediction of future observations with measure of uncertainty
- mostly model based, uses dependency and past data
- is an extrapolation, and thus often to be taken with a grain of salt
- similar to driving a car by looking in the rear-view mirror
4) Process Control:
The output of a (physical) process defines a time series
- a stochastic model is fitted to observed data
- this allows understanding both signal and noise
- it is feasible to monitor normal/abnormal fluctuations
5) Time Series Regression:
Modeling response time series using 1 or more input series
Fitting this model under an i.i.d. error assumption:
- leads to unbiased estimates, but...
- often grossly wrong standard errors
- thus, confidence intervals and tests are misleading
About the stock market problem:
- These TS are very volatile, which is difficult to model.
- For example, a change in a law that concerns the company could lead to a change in the TS process... how would any statistical tool predict that?
About serial correlation:
- In contrast to multivariate statistics, the data in a time series are usually not iid, but serially correlated.
- This can also be useful for detecting that something which is supposed to be iid is not, for example because of a dirty laboratory instrument.
– Nicole Origami Fox, answered Sep 19 at 12:42 (edited Sep 19 at 12:50)
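[Editor's note] As a minimal illustration of the exploratory goals above (time series plot, decomposition, correlogram), assuming Python with statsmodels and matplotlib, and an invented monthly series:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.graphics.tsaplots import plot_acf

# Illustrative monthly series: trend + yearly seasonality + noise.
rng = np.random.default_rng(0)
idx = pd.date_range("2010-01-01", periods=120, freq="MS")
y = pd.Series(
    0.5 * np.arange(120)                            # trend
    + 8 * np.sin(2 * np.pi * np.arange(120) / 12)   # seasonal pattern
    + rng.normal(0, 2, 120),                        # random error
    index=idx,
)

y.plot(title="Time series plot")                    # goal 1: visualize
seasonal_decompose(y, model="additive").plot()      # trend/seasonal/residual
plot_acf(y, lags=36)                                # correlogram
plt.show()
```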
[1] I would add classification, e.g., there are numerous apps that recognize your activity based on analyzing the accelerometer data of your phone. – SaiBot, Sep 20 at 7:26
That's interesting! How do you do this? – Nicole Origami Fox, Sep 20 at 8:49
[1] I think there are many different ways. One way would be to let users generate training data by labeling their activity. Once you have that, you could cut your time series into (overlapping) intervals (e.g., 3 seconds) and train a machine learning model. Afterwards you would be able to classify unlabeled activities. – SaiBot, Sep 20 at 10:20
Thank you, SaiBot. I see, I have to consider the combination of different tools more often :) – Nicole Origami Fox, Sep 20 at 11:01
9 votes
The easiest way to answer your question is to note that data sets are roughly categorized as cross-sectional, time series, and panel. Cross-sectional regression is the go-to tool for cross-sectional data sets; this is what most people know and refer to with the term regression. Time series regression is sometimes applied to time series, but time series analysis has a wide range of tools beyond regression.
An example of cross-sectional data is $(x_1,y_1),(x_2,y_2),\dots,(x_n,y_n)$, where $x_i,y_i$ are the weights and heights of randomly picked students in a school. When a sample is random, we can often run a linear regression $y\sim x$ and get reliable results, e.g. to predict the height $\hat y$ of a student in this school knowing only the student's weight $x$.
If the sample wasn't random, the regression may not work at all. For instance, suppose you picked only girls in first grade to estimate the model, but you have to predict the height of a male 12th grader. So regression has its own issues even in the cross-sectional setup.
Now look at time series data. It could be $(x_t,y_t)$ such as $(x_1,y_1),(x_2,y_2),\dots,(x_n,y_n)$, where $t$ indexes the month of the year, and $x,y$ are still weight and height, but of one particular student in this school.
Generally, regression need not work at all here. One reason is that the indices $t$ are ordered, so your sample is not random, and as mentioned earlier, regression prefers a random sample to work properly. This is a serious issue. Time series data tend to be persistent, e.g. your height this month is highly correlated with your height next month. Time series analysis was developed to deal with these issues; it includes regression techniques too, but they have to be used in certain ways.
The third common data set type is a panel, particularly one with longitudinal data. Here, you may get several snapshots of the weight and height variables for a number of students. This data set may look like waves of cross-sections or like a set of time series.
Naturally, this can be more complicated than the previous two types. Here we use panel regression and other special techniques developed for panels.
Summarizing, the reason why time series regression is considered a distinct tool from cross-sectional regression is that time series present unique challenges for the independence assumptions of regression. In particular, because the order of observations matters, unlike in cross-sectional analysis, time series exhibit all kinds of correlation and dependence structures, which may invalidate a naive application of regression. You have to deal with dependence, and that is exactly what time series analysis is good at.
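[Editor's note] A small sketch of the problem, assuming Python with numpy and statsmodels: regress one persistent series on a completely independent one, and the iid-based OLS standard errors look far too optimistic. The series and parameters are invented for illustration:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(1)
n = 200

# Two *independent* persistent series (random walks).
x = np.cumsum(rng.normal(size=n))
y = np.cumsum(rng.normal(size=n))

# Naive cross-sectional-style OLS of y on x.
res = sm.OLS(y, sm.add_constant(x)).fit()
print(res.summary().tables[1])                     # slope often looks "significant"
print("Durbin-Watson:", durbin_watson(res.resid))  # near 0 => strong serial correlation
```

The near-zero Durbin-Watson statistic flags exactly the ordered, dependent structure that time series methods are built to handle.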
Predictability of Asset Prices
Also, you're repeating a common misconception about stock markets, and asset prices in general: that they cannot be predicted. This statement is too general to be true. It's true that you can't reliably predict the next tick of AAPL outright, but that is a very narrow problem. If you cast your net wider, you'll discover a lot of opportunities to make money using all kinds of forecasting (and time series analysis in particular). Statistical arbitrage is one such field.
Now, the reason why asset prices are hard to predict in the near term is that a large component of price changes is new information. Truly new information that cannot realistically be derived from the past is, by definition, impossible to predict. However, this is an idealized model, and a lot of people would argue that anomalies exist that allow for persistence of state, meaning that part of a price change can be explained by the past. In such cases time series analysis is quite appropriate, because it deals precisely with persistence: it separates the new from the old. The new is impossible to predict, but the old is dragged from the past into the future. If you can explain even a little bit, in finance that means you may be able to make money, as long as the income generated by a strategy built on such forecasting covers its costs.
Finally, take a look at the 2013 Nobel Memorial Prize in economics: "it is quite possible to foresee the broad course of these prices over longer periods, such as the next three to five years." Shiller's Nobel lecture discusses the forecastability of asset prices.
6 votes
Time series analysis can also contribute to effective anomaly or outlier detection in temporal data.
As an example, it is possible to fit an ARIMA model and calculate a forecast interval. Depending on the use case, the interval can be used to set a threshold, within which the process can be said to be in control; if new data falls outside the threshold it is flagged for further attention.
This blog post has a brief and broad overview of time series analysis for outlier detection. For a more in-depth treatment, researchers at eBay explain how they carried out anomaly detection at scale based on the statistical analysis of time series data.
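[Editor's note] As a minimal sketch of that idea, assuming Python with statsmodels; the data, ARIMA order, and threshold rule are illustrative:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(7)

# Historical "in control" metric, roughly stationary around 100 (illustrative).
history = 100 + rng.normal(0, 1, 300)
new_obs = np.array([100.8, 99.5, 106.0])  # the last value is an injected anomaly

res = ARIMA(history, order=(1, 0, 0)).fit()
fc = res.get_forecast(steps=len(new_obs))
ci = fc.conf_int(alpha=0.01)  # 99% forecast interval defines the in-control band

# Points outside the forecast interval are flagged for further attention.
for obs, (lo, hi) in zip(new_obs, ci):
    status = "ANOMALY" if not (lo <= obs <= hi) else "ok"
    print(f"{obs:6.1f} in [{lo:6.1f}, {hi:6.1f}] -> {status}")
```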
6 votes
> There are plenty of other statistical methods, such as regression and machine learning, that have obvious use cases: regression can provide information on the relationship between two variables, while machine learning is great for prediction.
You answer your own question, below: autocorrelation. Time series usually have it, violating an assumption of basic OLS regression. Time series techniques have the appropriate assumptions for time series.
Machine learning methods which deal with sequential data are specialized, like recurrent neural networks (RNNs) or 1-D convolutional neural networks (CNNs), so you still have specialized techniques for time series.
> But meanwhile, I don't see what time series analysis is good for. Sure, I can fit an ARIMA model and use it for prediction, but what good is that when the confidence intervals for that prediction are going to be huge? There's a reason nobody can predict the stock market despite it being the most data-driven industry in world history.
Confidence intervals (CIs) resulting from a time series technique will probably be larger than those from a non-time-series regression. This feature is known as being accurate. In general, when you use a non-time-series regression, your CI will be smaller but incorrect, because you've violated its assumptions. If all you want to do is present a graph with small CIs, make them up or skip the CIs entirely; but if you want appropriate CIs, use appropriate techniques.
The stock market is hard to predict because of its nature. Other time series are much more predictable. Try using your machine learning technique of choice on the stock market and I doubt you'll have more success.
> Likewise, how do I use it to understand my process further? Sure, I can plot the ACF and go "aha! there's some dependence!", but then what? What's the point? Of course there's dependence, that's why you are doing time series analysis to begin with. You already knew there was dependence. But what are you going to use it for?
To predict. To see seasonality. To get an idea of the variability of the data over different seasons. Not to mention that there are more powerful time series techniques than old-school ARIMA, like state space methods. ARIMA is not the best technique for modeling time series. (In fact, the ARIMA procedure in your statistical software of choice is probably using a state space representation under the hood.)
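[Editor's note] For a taste of the state space methods mentioned above, here is a minimal sketch using statsmodels' UnobservedComponents (a structural local-linear-trend model with seasonality); the data and settings are illustrative, not the answerer's:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)

# Illustrative monthly data: slow trend + yearly seasonality + noise.
t = np.arange(144)
y = 0.3 * t + 5 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 1, 144)

# A structural state space model: local linear trend + 12-period seasonality.
mod = sm.tsa.UnobservedComponents(y, level="local linear trend", seasonal=12)
res = mod.fit(disp=False)

print(res.summary())
print(res.forecast(steps=12))  # forecasts from the fitted state space model
```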
5 votes
To add some color to the anomaly detection answer by redhqs, at work I build anomaly detection models for operational metrics like sales and traffic flows. We do the time series analysis to understand what sales ought to be if everything is working as expected, and then compare these to the observed values to see whether the web site is broken. It's important because for every minute the site is down we are losing a lot of money.
There are different methods you can use, and different methods try to accomplish different things in many cases. For example the main statistical method I have used for sales anomaly detection is called "STL" (seasonal-trend decomposition using loess). This separates the regular seasonality, trend and random noise. We actually use this to identify both daily and weekly seasonalities. Then we throw out the noise, and combine the trend/seasonalities to estimate expected sales. So in our case we use the approach to understand how sales vary with time of day and time of week, and to exclude random noise from the estimates.
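[Editor's note] A minimal sketch of that workflow, assuming Python with statsmodels' STL implementation; the daily sales series and the anomaly rule are invented for illustration:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL

rng = np.random.default_rng(11)

# Illustrative daily sales with weekly seasonality; inject one outage-like drop.
idx = pd.date_range("2018-01-01", periods=112, freq="D")
sales = 1000 + 200 * np.sin(2 * np.pi * np.arange(len(idx)) / 7) + rng.normal(0, 30, len(idx))
sales[60] -= 600  # simulated incident
y = pd.Series(sales, index=idx)

# Robust STL: separates trend, weekly seasonality, and residual noise.
stl = STL(y, period=7, robust=True).fit()
expected = stl.trend + stl.seasonal   # what sales "ought to be"
resid = y - expected                  # the random-noise component

# Flag observations whose residual is extreme relative to a robust spread.
cutoff = 4 * resid.abs().median() * 1.4826  # ~4 sigma, estimated via the MAD
print(y.index[resid.abs() > cutoff])        # should include the injected drop
```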
It appears that you build time series models that presume no anomalies, and thus may not be robust, as contrasted with model identification strategies that explicitly allow for the simultaneous identification of both the SARIMA portion and the waiting-to-be-discovered intervention structure (pulses, level/step shifts, seasonal pulses, local time trends). – IrishStat, Sep 19 at 15:57
The STL algo has a robustness flag (it controls the number of iterations of a smoothing loop). It works very well for our time series anyway. – Willie Wheeler, Sep 20 at 0:44
The number of iterations doesn't deal with the bias in model form if there is deterministic structure present waiting to be discovered. – IrishStat, Sep 20 at 1:41
3 votes
In addition to the excellent answers provided by others, I would like to comment on how time series analyses are used in electrical engineering.
A large part of electrical engineering consists of modulating voltages and currents to transmit information, or of using sensors to convert a physical signal (such as a sound wave) into an electrical form from which a computer is expected to make a decision. An analog-to-digital (A/D) converter translates these signals into a set of evenly spaced (in time) discrete samples, or a time series! Time series analysis methods are the basis of almost all modern signal processing algorithms.
For example, speech processing consists of using a microphone to convert a sound wave to an electrical voltage, which is sampled by an A/D, after which a time series model of the signal is created. For instance, linear predictive coders (LPC) in cell phones create an ARMA model of the words being spoken and transmit the model coefficients (along with an index representing an excitation signal from a pre-defined dictionary) instead of the data samples themselves to achieve data compression.
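[Editor's note] To illustrate the LPC idea, here is a sketch of the autocorrelation method with the Levinson-Durbin recursion in Python/numpy. The signal is synthetic, and real speech coders add windowing, excitation modeling, and quantization on top of this:

```python
import numpy as np

def lpc(x, order):
    """LPC coefficients via the autocorrelation method + Levinson-Durbin."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Autocorrelation lags r[0..order]
    r = np.array([np.dot(x[: n - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err  # reflection coefficient
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k  # remaining prediction error
    return a, err

# Synthetic "speech-like" signal: an AR(2) process (a resonant filter driven by noise).
rng = np.random.default_rng(5)
e = rng.normal(size=5000)
x = np.zeros_like(e)
for t in range(2, len(x)):
    x[t] = 1.3 * x[t - 1] - 0.6 * x[t - 2] + e[t]

a, err = lpc(x, order=2)
print(a)  # approx [1, -1.3, 0.6]: the model coefficients a coder would transmit
```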
7 Answers
7
active
oldest
votes
7 Answers
7
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
50
down vote
One main use is forecasting. I have been feeding my family for over a decade now by forecasting how many units of a specific product a supermarket will sell tomorrow, so he can order enough stock, but not too much. There is money in this.
Other forecasting use cases are given in publications like the International Journal of Forecasting or Foresight. (Full disclosure: I'm an Associate Editor of Foresight.)
Yes, sometimes the prediction-intervals are huge. (I assume you mean PIs, not confidence-intervals. There is a difference.) This simply means that the process is hard to forecast. Then you need to mitigate. In forecasting supermarket sales, this means you need a lot of safety stock. In forecasting sea level rises, this means you need to build higher levees. I would say that a large prediction interval does provide useful information.
And for all forecasting use cases, time-series analyis is useful, though forecasting is a larger topic. You can often improve forecasts by taking the dependencies in your time series into account, so you need to understand them through analysis, which is more specific than just knowing dependencies are there.
Plus, people are interested in time series even if they do not forecast. Econometricians like to detect change points in macroeconomic time series. Or assess the impact of an intervention, such as a change in tax laws, on GDP or something else. You may want to skim through your favorite econometrics journal for more inspiration.
13
+1. It is useful elsewhere as well. Doing a time series analysis will certainly alert you to events (you did not know of) that affect a result you are interested in. I and all my colleagues were completely stunned to find we generally were sliiightly worse on product chemical analyses on tuesday mornings. We tracked it back to a well intentioned cleaning schedule that had some weaknesses. We saved over the year close to a million and improved the product ppk from 1.7 to 1.9. Lesson learned: always do a rudimentary time series analysis on any variation framed problem.
â Stian Yttervik
Sep 19 at 13:33
+1. @Stephan Kolassa, your answer highlights how forecasts are used in the real world which is my interpretation of the OP's question. The prediction interval point and (half) length is exactly the information you care about for planning mitigation strategies. If you're building levee's to prevent Manhattan from flooding and your new time series method reduces the prediction interval by enough, you can reduce the cost of levee construction by using only the resources necessary. Mutatis mutandis applies for your grocery example.
â Lucas Roberts
Sep 22 at 15:14
add a comment |Â
up vote
50
down vote
One main use is forecasting. I have been feeding my family for over a decade now by forecasting how many units of a specific product a supermarket will sell tomorrow, so he can order enough stock, but not too much. There is money in this.
Other forecasting use cases are given in publications like the International Journal of Forecasting or Foresight. (Full disclosure: I'm an Associate Editor of Foresight.)
Yes, sometimes the prediction-intervals are huge. (I assume you mean PIs, not confidence-intervals. There is a difference.) This simply means that the process is hard to forecast. Then you need to mitigate. In forecasting supermarket sales, this means you need a lot of safety stock. In forecasting sea level rises, this means you need to build higher levees. I would say that a large prediction interval does provide useful information.
And for all forecasting use cases, time-series analyis is useful, though forecasting is a larger topic. You can often improve forecasts by taking the dependencies in your time series into account, so you need to understand them through analysis, which is more specific than just knowing dependencies are there.
Plus, people are interested in time series even if they do not forecast. Econometricians like to detect change points in macroeconomic time series. Or assess the impact of an intervention, such as a change in tax laws, on GDP or something else. You may want to skim through your favorite econometrics journal for more inspiration.
13
+1. It is useful elsewhere as well. Doing a time series analysis will certainly alert you to events (you did not know of) that affect a result you are interested in. I and all my colleagues were completely stunned to find we generally were sliiightly worse on product chemical analyses on tuesday mornings. We tracked it back to a well intentioned cleaning schedule that had some weaknesses. We saved over the year close to a million and improved the product ppk from 1.7 to 1.9. Lesson learned: always do a rudimentary time series analysis on any variation framed problem.
â Stian Yttervik
Sep 19 at 13:33
+1. @Stephan Kolassa, your answer highlights how forecasts are used in the real world which is my interpretation of the OP's question. The prediction interval point and (half) length is exactly the information you care about for planning mitigation strategies. If you're building levee's to prevent Manhattan from flooding and your new time series method reduces the prediction interval by enough, you can reduce the cost of levee construction by using only the resources necessary. Mutatis mutandis applies for your grocery example.
â Lucas Roberts
Sep 22 at 15:14
add a comment |Â
up vote
50
down vote
up vote
50
down vote
One main use is forecasting. I have been feeding my family for over a decade now by forecasting how many units of a specific product a supermarket will sell tomorrow, so he can order enough stock, but not too much. There is money in this.
Other forecasting use cases are given in publications like the International Journal of Forecasting or Foresight. (Full disclosure: I'm an Associate Editor of Foresight.)
Yes, sometimes the prediction-intervals are huge. (I assume you mean PIs, not confidence-intervals. There is a difference.) This simply means that the process is hard to forecast. Then you need to mitigate. In forecasting supermarket sales, this means you need a lot of safety stock. In forecasting sea level rises, this means you need to build higher levees. I would say that a large prediction interval does provide useful information.
And for all forecasting use cases, time-series analyis is useful, though forecasting is a larger topic. You can often improve forecasts by taking the dependencies in your time series into account, so you need to understand them through analysis, which is more specific than just knowing dependencies are there.
Plus, people are interested in time series even if they do not forecast. Econometricians like to detect change points in macroeconomic time series. Or assess the impact of an intervention, such as a change in tax laws, on GDP or something else. You may want to skim through your favorite econometrics journal for more inspiration.
One main use is forecasting. I have been feeding my family for over a decade now by forecasting how many units of a specific product a supermarket will sell tomorrow, so he can order enough stock, but not too much. There is money in this.
Other forecasting use cases are given in publications like the International Journal of Forecasting or Foresight. (Full disclosure: I'm an Associate Editor of Foresight.)
Yes, sometimes the prediction-intervals are huge. (I assume you mean PIs, not confidence-intervals. There is a difference.) This simply means that the process is hard to forecast. Then you need to mitigate. In forecasting supermarket sales, this means you need a lot of safety stock. In forecasting sea level rises, this means you need to build higher levees. I would say that a large prediction interval does provide useful information.
And for all forecasting use cases, time-series analyis is useful, though forecasting is a larger topic. You can often improve forecasts by taking the dependencies in your time series into account, so you need to understand them through analysis, which is more specific than just knowing dependencies are there.
Plus, people are interested in time series even if they do not forecast. Econometricians like to detect change points in macroeconomic time series. Or assess the impact of an intervention, such as a change in tax laws, on GDP or something else. You may want to skim through your favorite econometrics journal for more inspiration.
answered Sep 19 at 8:51
Stephan Kolassa
42k688154
42k688154
13
+1. It is useful elsewhere as well. Doing a time series analysis will certainly alert you to events (you did not know of) that affect a result you are interested in. I and all my colleagues were completely stunned to find we generally were sliiightly worse on product chemical analyses on tuesday mornings. We tracked it back to a well intentioned cleaning schedule that had some weaknesses. We saved over the year close to a million and improved the product ppk from 1.7 to 1.9. Lesson learned: always do a rudimentary time series analysis on any variation framed problem.
â Stian Yttervik
Sep 19 at 13:33
+1. @Stephan Kolassa, your answer highlights how forecasts are used in the real world which is my interpretation of the OP's question. The prediction interval point and (half) length is exactly the information you care about for planning mitigation strategies. If you're building levee's to prevent Manhattan from flooding and your new time series method reduces the prediction interval by enough, you can reduce the cost of levee construction by using only the resources necessary. Mutatis mutandis applies for your grocery example.
â Lucas Roberts
Sep 22 at 15:14
add a comment |Â
13
+1. It is useful elsewhere as well. Doing a time series analysis will certainly alert you to events (you did not know of) that affect a result you are interested in. I and all my colleagues were completely stunned to find we generally were sliiightly worse on product chemical analyses on tuesday mornings. We tracked it back to a well intentioned cleaning schedule that had some weaknesses. We saved over the year close to a million and improved the product ppk from 1.7 to 1.9. Lesson learned: always do a rudimentary time series analysis on any variation framed problem.
â Stian Yttervik
Sep 19 at 13:33
+1. @Stephan Kolassa, your answer highlights how forecasts are used in the real world which is my interpretation of the OP's question. The prediction interval point and (half) length is exactly the information you care about for planning mitigation strategies. If you're building levee's to prevent Manhattan from flooding and your new time series method reduces the prediction interval by enough, you can reduce the cost of levee construction by using only the resources necessary. Mutatis mutandis applies for your grocery example.
â Lucas Roberts
Sep 22 at 15:14
13
13
+1. It is useful elsewhere as well. Doing a time series analysis will certainly alert you to events (you did not know of) that affect a result you are interested in. I and all my colleagues were completely stunned to find we generally were sliiightly worse on product chemical analyses on tuesday mornings. We tracked it back to a well intentioned cleaning schedule that had some weaknesses. We saved over the year close to a million and improved the product ppk from 1.7 to 1.9. Lesson learned: always do a rudimentary time series analysis on any variation framed problem.
â Stian Yttervik
Sep 19 at 13:33
+1. It is useful elsewhere as well. Doing a time series analysis will certainly alert you to events (you did not know of) that affect a result you are interested in. I and all my colleagues were completely stunned to find we generally were sliiightly worse on product chemical analyses on tuesday mornings. We tracked it back to a well intentioned cleaning schedule that had some weaknesses. We saved over the year close to a million and improved the product ppk from 1.7 to 1.9. Lesson learned: always do a rudimentary time series analysis on any variation framed problem.
â Stian Yttervik
Sep 19 at 13:33
+1. @Stephan Kolassa, your answer highlights how forecasts are used in the real world which is my interpretation of the OP's question. The prediction interval point and (half) length is exactly the information you care about for planning mitigation strategies. If you're building levee's to prevent Manhattan from flooding and your new time series method reduces the prediction interval by enough, you can reduce the cost of levee construction by using only the resources necessary. Mutatis mutandis applies for your grocery example.
â Lucas Roberts
Sep 22 at 15:14
+1. @Stephan Kolassa, your answer highlights how forecasts are used in the real world which is my interpretation of the OP's question. The prediction interval point and (half) length is exactly the information you care about for planning mitigation strategies. If you're building levee's to prevent Manhattan from flooding and your new time series method reduces the prediction interval by enough, you can reduce the cost of levee construction by using only the resources necessary. Mutatis mutandis applies for your grocery example.
â Lucas Roberts
Sep 22 at 15:14
add a comment |Â
up vote
12
down vote
Goals in TS Analysis from the lesson-slides of M. Dettling:
1) Exploratory Analysis:
Visualization of the properties of the series
- time series plot
- decomposition into trend/seasonal pattern/random error
- correlogram for understanding the dependency structure
2) Modeling:
Fitting a stochastic model to the data that represents and
reflects the most important properties of the series
- done exploratory or with previous knowledge
- model choice and parameter estimation is crucial
- inference: how well does the model fit the data?
3) Forecasting:
Prediction of future observations with measure of uncertainty
- mostly model based, uses dependency and past data
- is an extrapolation, thus often to take with a grain of salt
- similar to driving a car by looking in the rear window mirror
4) Process Control:
The output of a (physical) process defines a time series
- a stochastic model is fitted to observed data
- this allows understanding both signal and noise
- it is feasible to monitor normal/abnormal fluctuations
5) Time Series Regression:
Modeling response time series using 1 or more input series
Fitting this model under i.i.d error assumption:
- leads to unbiased estimates, but...
- often grossly wrong standard errors
- thus, confidence intervals and tests are misleading
About the stock marked problem:
- These TS are very volatile, which is difficult to model.
- For example a change in a law that concerns the company could lead to a change in the TS process... how would any statistical tool predict that?
About serial correlation:
- In contrast to multivariate statistics, the data in a time series
are usually not iid, but are serially correlated. - This information can also be useful to detect something to be not iid, what is supposed to be, like for example a dirty laboratory instrument
1
I would add classification, e.g., there are numerous apps that recognize your activity based on analyzing the accelerometer data of your phone.
â SaiBot
Sep 20 at 7:26
That's interesting! How do you do this?
â Nicole Origami Fox
Sep 20 at 8:49
1
I think there are many different ways. One way would be to let users generate training data by labeling their activity. Once you have that you could cut your timeseries in (overlapping) intervals (e.g., 3 seconds) and train a machine learning model. Afterwards you would be able to classify unlabeled activities.
â SaiBot
Sep 20 at 10:20
Thank you SaiBot. I see, I have to consider the combination of different tools more often :)
â Nicole Origami Fox
Sep 20 at 11:01
add a comment |Â
up vote
12
down vote
Goals in TS Analysis from the lesson-slides of M. Dettling:
1) Exploratory Analysis:
Visualization of the properties of the series
- time series plot
- decomposition into trend/seasonal pattern/random error
- correlogram for understanding the dependency structure
2) Modeling:
Fitting a stochastic model to the data that represents and
reflects the most important properties of the series
- done exploratory or with previous knowledge
- model choice and parameter estimation is crucial
- inference: how well does the model fit the data?
3) Forecasting:
Prediction of future observations with measure of uncertainty
- mostly model based, uses dependency and past data
- is an extrapolation, thus often to take with a grain of salt
- similar to driving a car by looking in the rear window mirror
4) Process Control:
The output of a (physical) process defines a time series
- a stochastic model is fitted to observed data
- this allows understanding both signal and noise
- it is feasible to monitor normal/abnormal fluctuations
5) Time Series Regression:
Modeling response time series using 1 or more input series
Fitting this model under i.i.d error assumption:
- leads to unbiased estimates, but...
- often grossly wrong standard errors
- thus, confidence intervals and tests are misleading
About the stock marked problem:
- These TS are very volatile, which is difficult to model.
- For example a change in a law that concerns the company could lead to a change in the TS process... how would any statistical tool predict that?
About serial correlation:
- In contrast to multivariate statistics, the data in a time series
are usually not iid, but are serially correlated. - This information can also be useful to detect something to be not iid, what is supposed to be, like for example a dirty laboratory instrument
1
I would add classification, e.g., there are numerous apps that recognize your activity based on analyzing the accelerometer data of your phone.
â SaiBot
Sep 20 at 7:26
That's interesting! How do you do this?
â Nicole Origami Fox
Sep 20 at 8:49
1
I think there are many different ways. One way would be to let users generate training data by labeling their activity. Once you have that you could cut your timeseries in (overlapping) intervals (e.g., 3 seconds) and train a machine learning model. Afterwards you would be able to classify unlabeled activities.
â SaiBot
Sep 20 at 10:20
Thank you SaiBot. I see, I have to consider the combination of different tools more often :)
â Nicole Origami Fox
Sep 20 at 11:01
add a comment |Â
up vote
12
down vote
up vote
12
down vote
Goals in TS Analysis from the lesson-slides of M. Dettling:
1) Exploratory Analysis:
Visualization of the properties of the series
- time series plot
- decomposition into trend/seasonal pattern/random error
- correlogram for understanding the dependency structure
2) Modeling:
Fitting a stochastic model to the data that represents and
reflects the most important properties of the series
- done exploratory or with previous knowledge
- model choice and parameter estimation is crucial
- inference: how well does the model fit the data?
3) Forecasting:
Prediction of future observations with measure of uncertainty
- mostly model based, uses dependency and past data
- is an extrapolation, thus often to take with a grain of salt
- similar to driving a car by looking in the rear window mirror
4) Process Control:
The output of a (physical) process defines a time series
- a stochastic model is fitted to observed data
- this allows understanding both signal and noise
- it is feasible to monitor normal/abnormal fluctuations
5) Time Series Regression:
Modeling response time series using 1 or more input series
Fitting this model under i.i.d error assumption:
- leads to unbiased estimates, but...
- often grossly wrong standard errors
- thus, confidence intervals and tests are misleading
About the stock marked problem:
- These TS are very volatile, which is difficult to model.
- For example a change in a law that concerns the company could lead to a change in the TS process... how would any statistical tool predict that?
About serial correlation:
- In contrast to multivariate statistics, the data in a time series
are usually not iid, but are serially correlated. - This information can also be useful to detect something to be not iid, what is supposed to be, like for example a dirty laboratory instrument
Goals in TS Analysis from the lesson-slides of M. Dettling:
1) Exploratory Analysis:
Visualization of the properties of the series
- time series plot
- decomposition into trend/seasonal pattern/random error
- correlogram for understanding the dependency structure
2) Modeling:
Fitting a stochastic model to the data that represents and
reflects the most important properties of the series
- done exploratory or with previous knowledge
- model choice and parameter estimation is crucial
- inference: how well does the model fit the data?
3) Forecasting:
Prediction of future observations with measure of uncertainty
- mostly model based, uses dependency and past data
- is an extrapolation, thus often to take with a grain of salt
- similar to driving a car by looking in the rear window mirror
4) Process Control:
The output of a (physical) process defines a time series
- a stochastic model is fitted to observed data
- this allows understanding both signal and noise
- it is feasible to monitor normal/abnormal fluctuations
5) Time Series Regression:
Modeling response time series using 1 or more input series
Fitting this model under i.i.d error assumption:
- leads to unbiased estimates, but...
- often grossly wrong standard errors
- thus, confidence intervals and tests are misleading
About the stock marked problem:
- These TS are very volatile, which is difficult to model.
- For example a change in a law that concerns the company could lead to a change in the TS process... how would any statistical tool predict that?
About serial correlation:
- In contrast to multivariate statistics, the data in a time series
are usually not iid, but are serially correlated. - This information can also be useful to detect something to be not iid, what is supposed to be, like for example a dirty laboratory instrument
edited Sep 19 at 12:50
answered Sep 19 at 12:42
Nicole Origami Fox
1216
1216
1
I would add classification, e.g., there are numerous apps that recognize your activity based on analyzing the accelerometer data of your phone.
â SaiBot
Sep 20 at 7:26
That's interesting! How do you do this?
â Nicole Origami Fox
Sep 20 at 8:49
1
I think there are many different ways. One way would be to let users generate training data by labeling their activity. Once you have that you could cut your timeseries in (overlapping) intervals (e.g., 3 seconds) and train a machine learning model. Afterwards you would be able to classify unlabeled activities.
â SaiBot
Sep 20 at 10:20
Thank you SaiBot. I see, I have to consider the combination of different tools more often :)
â Nicole Origami Fox
Sep 20 at 11:01
add a comment |Â
1
I would add classification, e.g., there are numerous apps that recognize your activity based on analyzing the accelerometer data of your phone.
â SaiBot
Sep 20 at 7:26
That's interesting! How do you do this?
â Nicole Origami Fox
Sep 20 at 8:49
1
I think there are many different ways. One way would be to let users generate training data by labeling their activity. Once you have that you could cut your timeseries in (overlapping) intervals (e.g., 3 seconds) and train a machine learning model. Afterwards you would be able to classify unlabeled activities.
â SaiBot
Sep 20 at 10:20
Thank you SaiBot. I see, I have to consider the combination of different tools more often :)
â Nicole Origami Fox
Sep 20 at 11:01
1
1
I would add classification, e.g., there are numerous apps that recognize your activity based on analyzing the accelerometer data of your phone.
â SaiBot
Sep 20 at 7:26
I would add classification, e.g., there are numerous apps that recognize your activity based on analyzing the accelerometer data of your phone.
â SaiBot
Sep 20 at 7:26
That's interesting! How do you do this?
â Nicole Origami Fox
Sep 20 at 8:49
That's interesting! How do you do this?
â Nicole Origami Fox
Sep 20 at 8:49
1
1
I think there are many different ways. One way would be to let users generate training data by labeling their activity. Once you have that you could cut your timeseries in (overlapping) intervals (e.g., 3 seconds) and train a machine learning model. Afterwards you would be able to classify unlabeled activities.
â SaiBot
Sep 20 at 10:20
I think there are many different ways. One way would be to let users generate training data by labeling their activity. Once you have that you could cut your timeseries in (overlapping) intervals (e.g., 3 seconds) and train a machine learning model. Afterwards you would be able to classify unlabeled activities.
â SaiBot
Sep 20 at 10:20
Thank you SaiBot. I see, I have to consider the combination of different tools more often :)
â Nicole Origami Fox
Sep 20 at 11:01
Thank you SaiBot. I see, I have to consider the combination of different tools more often :)
â Nicole Origami Fox
Sep 20 at 11:01
add a comment |Â
up vote
9
down vote
The easiest way to answer your question is to understand that roughly the data sets are often categorized as cross-sectional, time series and panel. Cross-sectional regression is a go-to tool for the cross-sectional data sets. This is what most people know and refer to with a term regression. Time series regression is sometimes applied to time-series, but time series analysis has a wide range of tools beyond the regression.
Example of cross-sectional data is $(x_1,y_1),(x_2,y_3),dots,(x_n,y_n)$, where $x_i,y_i$ are weights and heights of randomly picked students in a school. When a sample is random we can often run a linear regression $ysim x$ and get reliable results, to maybe predict height $hat y$ of a student in this school knowing only student's weight $x$.
If the sample wasn't random, then the regression may not work at all. For instance, you picked only girls in first grade to estimate the model, but you have to predict the height of a male 12th grader. So, the regression has its own issues even in the cross-sectional setup.
Now, look at time series data, it could be $x_t,y_t$ such as $(x_1,y_1),(x_2,y_3),dots,(x_n,y_n)$, where $t$ the month of a year, and $x,y$ are still weight and height but of a particular student in this school.
Generally, regression doesn't have to work at all. One reason is that indices $t$ are ordered. So your sample is not random, and I mentioned earlier that regression prefers a random sample to work properly. This is a serious issue. Time series data tend to be persistent, e.g. your height this month is highly correlated to your height next month. In order to deal with these issues time series analysis was developed, it included the regression technique too, but it has to be used in certain ways.
The third common dataset type is a panel, particularly, the one iwth longitudinal data. Here, you may get several snapshots of weight and height variables for a number of students. This dataset may look like waves of cross-sections or a set of time series.
Naturally, this can be more complicated than previous two types. Here we use panel regression and other special techniques developed for panels.
Summarizing, the reason why time series regression is considered as a distinct tool compared to cross-sectional regression is that time series present unique challenges when it comes to independence assumptions of regression technique. Particularly, due to the fact that unlike in cross-sectional analysis, the order of observations matters, it usually leads to all kinds of correlation and dependence structures, which may sometimes invalidate application of regression techniques. You have to deal with dependence, and that's exactly what time series analysis is good at.
Predictability of Asset Prices
Also, you're repeating a common misconception about stock markets and asset prices in general, that they cannot be predicted. This statement is too general to be true. It's true that you can't outright predict next tick of AAPL reliably. However, it's a very narrow problem. If you cast your net wider you'll discover a lot of opportunities to make money using for all kinds of forecasting (and time series analysis in particular). Statistical arbitrage is one such field.
Now, the reason why asset prices are hard to predict in near term is due to the fact that a large component of price changes is new information. The truly new information that cannot be realistically devised from the past is by definition impossible to predict. However, this is an idealized model, and a lot of people would argue that the anomalies exist that allow for persistence of state. This means that the part of price change can be explained by the past. In such cases time series analysis is quite appropriate because it precisely deals with persistence. It separate new from old, new is impossible to predict, but old is dragged from the past into the future. If you can explain even a little bit, in finance it means that you may be able to make money. As long as the price of the strategy built on such forecasting covers the income generated by it.
Finally, take a look at the economics nobel prize in 2013: "it is quite possible to foresee the broad course of these prices over longer periods, such as the next three to five years." Take a look at Shiller's nobel lecture, he discusses forecastability of asset prices.
add a comment |Â
up vote
9
down vote
The easiest way to answer your question is to understand that roughly the data sets are often categorized as cross-sectional, time series and panel. Cross-sectional regression is a go-to tool for the cross-sectional data sets. This is what most people know and refer to with a term regression. Time series regression is sometimes applied to time-series, but time series analysis has a wide range of tools beyond the regression.
Example of cross-sectional data is $(x_1,y_1),(x_2,y_3),dots,(x_n,y_n)$, where $x_i,y_i$ are weights and heights of randomly picked students in a school. When a sample is random we can often run a linear regression $ysim x$ and get reliable results, to maybe predict height $hat y$ of a student in this school knowing only student's weight $x$.
The easiest way to answer your question is to note that data sets are roughly categorized as cross-sectional, time series, and panel. Cross-sectional regression is the go-to tool for cross-sectional data sets; it is what most people know and refer to with the term "regression". Time series regression is sometimes applied to time series, but time series analysis has a wide range of tools beyond regression.
An example of cross-sectional data is $(x_1,y_1),(x_2,y_2),\dots,(x_n,y_n)$, where $x_i,y_i$ are the weights and heights of randomly picked students in a school. When the sample is random, we can often run a linear regression $y\sim x$ and get reliable results, e.g. to predict the height $\hat y$ of a student in this school knowing only the student's weight $x$.
If the sample isn't random, the regression may not work at all. For instance, suppose you picked only girls in first grade to estimate the model, but you have to predict the height of a male 12th grader. So regression has its own issues even in the cross-sectional setup.
Now look at time series data: it could be $(x_t,y_t)$ such as $(x_1,y_1),(x_2,y_2),\dots,(x_n,y_n)$, where $t$ indexes the month of the year and $x,y$ are still weight and height, but of one particular student in this school.
Here, regression doesn't have to work at all. One reason is that the indices $t$ are ordered, so your sample is not random, and as mentioned earlier, regression needs a random sample to work properly. This is a serious issue: time series data tend to be persistent, e.g. your height this month is highly correlated with your height next month. Time series analysis was developed to deal with these issues; it includes regression techniques too, but they have to be used in particular ways. A small simulation below shows how badly naive regression can misbehave on persistent data.
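Here is a minimal sketch of the classic "spurious regression" effect; the data are synthetic and the numpy/scipy choices are mine, not the answer's. Two independent random walks have no true relationship, yet ordinary regression routinely declares one "significant":
```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 200
# Two independent random walks: there is no true relationship between x and y
x = np.cumsum(rng.normal(size=n))
y = np.cumsum(rng.normal(size=n))

res = stats.linregress(x, y)
# Persistence fools the iid-based test: the p-value is usually tiny
print(f"slope={res.slope:.2f}, p-value={res.pvalue:.2g}")
```
Rerunning with different seeds keeps producing "significant" slopes, which is exactly the failure mode time series methods are built to handle.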
The third common dataset type is the panel, in particular longitudinal data. Here you may get several snapshots of the weight and height variables for a number of students. Such a dataset may look like waves of cross-sections, or like a set of time series.
Naturally, this can be more complicated than the previous two types; here we use panel regression and other techniques developed specifically for panels.
Summarizing: the reason time series regression is considered a distinct tool from cross-sectional regression is that time series present unique challenges for the independence assumptions of regression. In particular, because the order of observations matters (unlike in cross-sectional analysis), all kinds of correlation and dependence structures arise, and these can invalidate the naive application of regression techniques. You have to deal with dependence, and that's exactly what time series analysis is good at.
Predictability of Asset Prices
You're also repeating a common misconception about stock markets, and asset prices in general: that they cannot be predicted. That statement is too general to be true. It is true that you can't reliably predict the next tick of AAPL, but that is a very narrow problem. If you cast your net wider, you'll discover plenty of opportunities to make money using all kinds of forecasting (and time series analysis in particular); statistical arbitrage is one such field.
The reason asset prices are hard to predict in the near term is that a large component of price changes is new information. Truly new information, which cannot realistically be derived from the past, is by definition impossible to predict. However, this is an idealized model, and many people would argue that anomalies exist which allow some persistence of state, i.e. that part of a price change can be explained by the past. In such cases time series analysis is quite appropriate, because it deals precisely with persistence: it separates the new from the old; the new is impossible to predict, but the old is dragged from the past into the future. If you can explain even a little bit, in finance that may mean you can make money, as long as the income generated by a strategy built on such forecasting covers its costs.
Finally, take a look at the 2013 Nobel Memorial Prize in economics: "it is quite possible to foresee the broad course of these prices over longer periods, such as the next three to five years." Shiller's Nobel lecture discusses the forecastability of asset prices.
answered Sep 19 at 14:09, edited Sep 21 at 14:39
Aksakal
Time series analysis can also contribute to effective anomaly or outlier detection in temporal data.
As an example, it is possible to fit an ARIMA model and calculate a forecast interval. Depending on the use case, the interval can be used to set a threshold within which the process can be said to be in control; if a new observation falls outside the threshold, it is flagged for further attention.
This blog post has a brief and broad overview of time series analysis for outlier detection. For a more in-depth treatment, researchers at eBay explain how they carried out anomaly detection at scale based on statistical analysis of time series data.
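A minimal sketch of that idea (my own toy example on synthetic data, assuming statsmodels is available; the model order and alpha are illustrative choices):
```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
# Synthetic "in control" AR(1) history
history = [0.0]
for _ in range(299):
    history.append(0.7 * history[-1] + rng.normal())
history = np.asarray(history)

res = ARIMA(history, order=(1, 0, 0)).fit()
fc = res.get_forecast(steps=1)
lo, hi = fc.conf_int(alpha=0.01)[0]  # 99% forecast interval for the next point

new_obs = 6.5  # a suspiciously large new reading
if not (lo <= new_obs <= hi):
    print(f"anomaly: {new_obs} outside [{lo:.2f}, {hi:.2f}]")
```
In production one would refit or update the model as new data arrive and tune the interval width to the desired false-alarm rate.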
answered Sep 19 at 9:45
redhqs
There are plenty of other statistical methods, such as regression and
machine learning, that have obvious use cases: regression can provide
information on the relationship between two variables, while machine
learning is great for prediction.
You answer your own question below: autocorrelation. Time series usually have it, violating an assumption of basic OLS regression. Time series techniques have the appropriate assumptions for time series.
Machine learning methods which deal with sequential data are specialized, like recurrent neural networks (RNNs) or 1-D convolutional neural networks (CNNs), so you still have specialized techniques for time series.
But meanwhile, I don't see what time series analysis is good for.
Sure, I can fit an ARIMA model and use it for prediction, but what
good is that when the confidence intervals for that prediction are
going to be huge? There's a reason nobody can predict the stock market
despite it being the most data-driven industry in world history.
Confidence intervals (CIs) resulting from a time series technique will probably be larger than those from a non-time-series regression. This feature is known as being accurate. In general, a non-time-series regression will give you a smaller CI, but it is incorrect, because you've violated its assumptions. If all you want to do is present a graph with small CIs, make them up or skip the CIs entirely; but if you want appropriate CIs, use appropriate techniques.
The stock market is hard to predict because of its nature. Other time series are much more predictable. Try using your machine learning technique of choice on the stock market, and I doubt you'll have more success.
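A small simulation sketch of the too-narrow-CI point (synthetic AR(1) data; numpy assumed, my own illustration): the familiar iid standard error $s/\sqrt{n}$ badly understates the uncertainty of the mean of a persistent series:
```python
import numpy as np

rng = np.random.default_rng(0)
phi, n = 0.8, 500
x = np.empty(n)
x[0] = rng.normal()
for t in range(1, n):  # AR(1): strongly autocorrelated data
    x[t] = phi * x[t - 1] + rng.normal()

se_iid = x.std(ddof=1) / np.sqrt(n)  # naive SE, pretends observations are independent
# Large-sample AR(1) correction inflates the SE by sqrt((1 + phi) / (1 - phi))
phi_hat = np.corrcoef(x[:-1], x[1:])[0, 1]
se_ar1 = se_iid * np.sqrt((1 + phi_hat) / (1 - phi_hat))
print(f"iid SE: {se_iid:.3f}, AR(1)-adjusted SE: {se_ar1:.3f}")  # adjusted is ~3x wider
```
The wider interval is not a defect of the time series method; it is the honest width.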
Likewise, how do I use it to understand my process further? Sure, I
can plot the ACF and go "aha! there's some dependence!", but then
what? What's the point? Of course there's dependence, that's why you
are doing time series analysis to begin with. You already knew there
was dependence. But what are you going to use it for?
To predict. To see seasonality. To get an idea of the variability of the data across seasons. Not to mention that there are more powerful time series techniques than old-school ARIMA, such as state space methods; ARIMA is not the best technique for modeling time series. (In fact, the ARIMA procedure in your statistical software of choice is probably using a state space representation under the hood.) A small state space sketch follows.
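As a hedged illustration of the state space point (my own sketch, assuming statsmodels; a local level model fitted to synthetic data):
```python
import numpy as np
from statsmodels.tsa.statespace.structural import UnobservedComponents

rng = np.random.default_rng(4)
level = np.cumsum(rng.normal(0, 0.5, 200))  # slowly drifting true level
y = level + rng.normal(0, 1.0, 200)         # noisy observations of it

model = UnobservedComponents(y, level="local level")
res = model.fit(disp=False)
fc = res.get_forecast(steps=10)
print(fc.predicted_mean)  # forecasts with properly propagated uncertainty
print(fc.conf_int())      # ...and the matching (wide, honest) intervals
```
The same framework extends to trends, seasonal components, and regression effects without changing the estimation machinery.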
answered Sep 19 at 22:18, edited Sep 20 at 13:42
Wayne
To add some color to the anomaly detection answer by redhqs: at work I build anomaly detection models for operational metrics like sales and traffic flows. We do the time series analysis to understand what sales ought to be if everything is working as expected, and then compare that to the observed values to see whether the web site is broken. This matters because every minute the site is down, we lose a lot of money.
There are different methods you can use, and in many cases different methods try to accomplish different things. The main statistical method I have used for sales anomaly detection is "STL" (seasonal-trend decomposition using Loess). It separates the regular seasonality, the trend, and random noise; we actually use it to identify both daily and weekly seasonalities. We then throw out the noise and combine the trend and seasonalities to estimate expected sales. So in our case the approach tells us how sales vary with time of day and time of week, while excluding random noise from the estimates. A sketch of this pipeline is below.
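A minimal sketch of that pipeline (my own toy example, assuming statsmodels' STL implementation; the weekly period, injected anomaly, and threshold are illustrative assumptions):
```python
import numpy as np
from statsmodels.tsa.seasonal import STL

rng = np.random.default_rng(1)
n = 28 * 6  # about six weeks of daily "sales"
t = np.arange(n)
sales = 100 + 0.1 * t + 10 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 2, n)
sales[100] += 40  # inject one anomaly to find

res = STL(sales, period=7, robust=True).fit()  # robust loop downweights outliers
expected = res.trend + res.seasonal            # "expected sales": trend + seasonality
resid = sales - expected

# Flag residuals more than 3 robust standard deviations from zero (MAD-based)
thresh = 3 * 1.4826 * np.median(np.abs(resid - np.median(resid)))
print(np.where(np.abs(resid) > thresh)[0])  # should report index 100
```
STL handles one seasonal period per pass; for both daily and weekly patterns one would decompose at one period and re-decompose the remainder at the other.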
It appears that you build time series models that presume no anomalies, and thus may not be robust, as contrasted with model identification strategies that explicitly allow for the simultaneous identification of both the SARIMA portion and the waiting-to-be-discovered intervention structure (pulses, level/step shifts, seasonal pulses, local time trends).
– IrishStat
Sep 19 at 15:57
The STL algorithm has a robustness flag (it controls the number of iterations of a smoothing loop). It works very well for our time series, anyway.
– Willie Wheeler
Sep 20 at 0:44
The number of iterations doesn't deal with the bias in model form if there is deterministic structure present, waiting to be discovered.
– IrishStat
Sep 20 at 1:41
answered Sep 19 at 13:45, edited Sep 19 at 13:52
Willie Wheeler
In addition to the excellent answers provided by others, I would like to comment on how time series analyses are used in electrical engineering.
A large part of electrical engineering consists of modulating voltages and currents to transmit information, or of using sensors to convert a physical signal (such as a sound wave) into an electrical form from which a computer is expected to make a decision. An analog-to-digital (A/D) converter translates these signals into a set of evenly spaced (in time) discrete samples: a time series! Time series analysis methods are the basis of almost all modern signal processing algorithms.
For example, speech processing consists of using a microphone to convert a sound wave to an electrical voltage, which is sampled by an A/D converter, after which a time series model of the signal is created. For instance, linear predictive coders (LPC) in cell phones create an ARMA model of the words being spoken and transmit the model coefficients (along with an index representing an excitation signal from a pre-defined dictionary) instead of the data samples themselves, achieving data compression. A toy sketch of the LPC idea is below.
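Here is a toy numpy sketch of LPC via the autocorrelation method (the synthetic "voiced frame" and the order of 10 are my illustrative assumptions, not from the answer):
```python
import numpy as np

def lpc_coeffs(x, order):
    """Autocorrelation-method LPC: solve the Yule-Walker normal equations."""
    r = np.correlate(x, x, mode="full")[len(x) - 1 : len(x) + order]
    # Toeplitz system R a = r[1:], where a holds the predictor coefficients
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1 : order + 1])

rng = np.random.default_rng(2)
# A toy "voiced frame": two resonances plus a little noise, 20 ms at 8 kHz
t = np.arange(160) / 8000.0
frame = np.sin(2 * np.pi * 500 * t) + 0.5 * np.sin(2 * np.pi * 1500 * t)
frame += 0.01 * rng.normal(size=frame.size)

a = lpc_coeffs(frame, order=10)  # 10 coefficients stand in for 160 samples
# One-step linear prediction: x_hat[n] = sum_k a[k] * x[n - k]
pred = np.convolve(frame, np.concatenate(([0.0], a)))[: frame.size]
print(f"mean squared prediction error: {np.mean((frame - pred) ** 2):.4f}")
```
The transmitter sends the coefficients plus a compact description of the residual; the receiver runs the model forward to resynthesize the frame.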
answered Sep 20 at 18:52
Carlos Danger