Untold reality of P – Hacking in Finance: Is your data valid?


Is P – Hacking in Finance a new threat? How can statistical data turn out to be invalid? We answer these and other questions in today’s discussion, ‘Untold reality of P – Hacking in Finance’.

AtoZForex – Nowadays, people use statistical data in their everyday lives, whether they are deciding to buy a new fridge or to invest money in Forex. We have grown used to relying on the ‘statistics’ provided by numerous organizations and firms. However, does this data actually reflect reality?

Most of the empirical research in finance is likely false?

Campbell Russell Harvey, a Canadian economist and a professor at Duke University’s Fuqua School of Business, believes that there is a trend to ‘torture’ the data until it confesses. In other words, researchers sometimes run multiple tests on the same data until they find something that can be claimed to be statistically significant. Mr. Harvey has further stated:

“Unfortunately, our standard testing methods are often ill-equipped to answer the questions that we pose. We are not salespeople. We are scientists!”

Such an approach may be of only relative significance outside the financial industry. However, when it enters the field of financial statistics, many could face a big problem. For instance, products such as exchange-traded funds are built using the same academic statistical techniques. In a 2014 paper written with Yan Liu, a colleague from Duke, Mr. Harvey wrote:

“Most of the empirical research in finance is likely false. This implies that half the financial products (promising outperformance) that companies are selling to clients are false.”

The key here is to find out where the problem starts. Specifically, Mr. Harvey believes that the core of the issue is that it is difficult to beat the market, yet people try anyway.

Backtesting: Will it help you?

People now have plenty of computing power available to them and can test hundreds of trading strategies. One such test is called backtesting. Backtesting allows you to see how a strategy would have performed if it had been implemented during the up and down trends in the market over, say, the past couple of decades. As a quality check, the strategy is also tested on a different set of ‘out-of-sample’ data, that is, data that was not used to develop the strategy.
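To make the idea concrete, here is a minimal sketch of a backtest with an out-of-sample split. Everything in it is an assumption made for illustration: the prices are synthetic random numbers rather than market data, and `make_prices`, `backtest_momentum`, and the 70/30 split are hypothetical names and choices of our own.

```python
import random


def make_prices(n, seed=42):
    """Synthetic daily prices (an assumption; real backtests use market data)."""
    rng = random.Random(seed)
    prices = [100.0]
    for _ in range(n - 1):
        # Small random daily return with a slight upward drift.
        prices.append(prices[-1] * (1.0 + rng.gauss(0.0002, 0.01)))
    return prices


def backtest_momentum(prices):
    """Total return of a toy rule: hold the asset for a day
    only if the previous day's return was positive."""
    equity = 1.0
    for t in range(2, len(prices)):
        previous_return = prices[t - 1] / prices[t - 2] - 1.0
        if previous_return > 0:
            equity *= prices[t] / prices[t - 1]
    return equity - 1.0


prices = make_prices(2000)
split = int(len(prices) * 0.7)

# In-sample data is used to develop the rule;
# out-of-sample data is held back as a quality check.
in_sample = prices[:split]
out_of_sample = prices[split:]

print(f"in-sample return:     {backtest_momentum(in_sample):+.2%}")
print(f"out-of-sample return: {backtest_momentum(out_of_sample):+.2%}")
```

The point of the split is that the out-of-sample figure is looked at only once, after the rule is fixed. Looking at it repeatedly while adjusting the rule turns it into in-sample data.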

Data torturing example

Nevertheless, in the wrong hands, this method can be a harmful weapon. The pattern is simple, and Randall Munroe’s webcomic xkcd describes the data torturing process perfectly. It pictures a woman claiming that jelly beans cause acne. A test of the hypothesis shows no evidence. The woman then changes her initial claim: she now states that it all depends on the flavor of the jelly bean. The statistician checks 20 different flavors, and nineteen of them show nothing. By chance, there is a correlation between one particular jelly bean flavor and acne breakouts.

In the end, the cartoon displays the front page of a newspaper with the headline: “Green Jelly Beans Linked to Acne! 95% Confidence. Only 5% Chance of Coincidence!”
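The cartoon's arithmetic is easy to check. Assuming the 20 flavor tests are independent and jelly beans truly have no effect, the chance that at least one test comes out 'significant' at the 5% level is 1 - 0.95^20, or roughly 64%. The sketch below (the function names are our own) computes that figure and compares it with a quick simulation.

```python
import random


def familywise_false_positive_rate(num_tests, alpha):
    """Chance that at least one of num_tests independent tests is
    'significant' at level alpha when no real effect exists."""
    return 1.0 - (1.0 - alpha) ** num_tests


def simulate(num_tests, alpha, trials, seed=0):
    """Estimate the same probability by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Assumption: with no real effect, each p-value is uniform on [0, 1],
        # so a test is a false positive with probability alpha.
        if any(rng.random() < alpha for _ in range(num_tests)):
            hits += 1
    return hits / trials


analytic = familywise_false_positive_rate(20, 0.05)
simulated = simulate(20, 0.05, trials=20_000)
print(f"analytic:  {analytic:.3f}")
print(f"simulated: {simulated:.3f}")
```

So a 'hit' somewhere among 20 flavors is more likely than not, even if jelly beans do nothing at all. The newspaper's "only 5% chance of coincidence" would apply to a single test chosen in advance, not to the best of twenty.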


Researchers might use the same pattern in financial statistics. Scary, right? Statisticians might twist the testing process. They can change the period, the set of assets, or sometimes the statistical method. Afterward, negative findings are set aside, and positive ones are submitted to a journal or used to build an ETF.

Here one might ask: what about testing on out-of-sample data? Yes, it does help to keep you honest. However, it does not eliminate the problem, since with enough tests performed you will eventually reach the exact result you want, even out of sample.
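A rough simulation shows why out-of-sample testing alone is not a cure. In the sketch below, every 'strategy' is pure noise with zero expected return, which is an assumption made for illustration, as are the names `t_stat` and `count_survivors`, the 60-day windows, and the t-statistic threshold of 2. We count how many worthless strategies look significant in-sample, and how many of those also pass out-of-sample.

```python
import math
import random


def t_stat(returns):
    """t-statistic of the mean return against zero."""
    n = len(returns)
    mean = sum(returns) / n
    variance = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(variance / n)


def count_survivors(num_strategies, days_in, days_out, threshold=2.0, seed=1):
    """Return (passed in-sample, passed both) for pure-noise strategies."""
    rng = random.Random(seed)
    passed_in = 0
    passed_both = 0
    for _ in range(num_strategies):
        # Assumption: daily returns are noise with zero true edge.
        in_sample = [rng.gauss(0.0, 0.01) for _ in range(days_in)]
        out_sample = [rng.gauss(0.0, 0.01) for _ in range(days_out)]
        if t_stat(in_sample) > threshold:
            passed_in += 1
            if t_stat(out_sample) > threshold:
                passed_both += 1
    return passed_in, passed_both


num_strategies = 10_000
passed_in, passed_both = count_survivors(num_strategies, days_in=60, days_out=60)
print(f"strategies tested:        {num_strategies}")
print(f"significant in-sample:    {passed_in}")
print(f"also significant out:     {passed_both}")
```

Under these assumptions, roughly 2 to 3 percent of the noise strategies clear the bar in-sample, and a similar share of those can be expected to clear it again out-of-sample. Test enough strategies and a few will typically pass both checks by luck alone, which is why the number of strategies tried matters as much as the result that gets reported.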

Untold reality of P – Hacking in Finance

Campbell Harvey calls this technique of ‘torturing’ the data ‘p-hacking’. The name refers to the p-value, which is a measure of statistical significance. According to Andrew Lo, the director of MIT’s Laboratory for Financial Engineering, the technique is also known as overfitting, data-snooping, or data-mining. Mr. Lo further explains:

“The more you search over the past, the more likely it is you are going to find exotic patterns that you happen to like or focus on. Those patterns are least likely to repeat.”

Mr. Harvey believes that finance is lagging behind other sectors in making sure that its findings are actually valid. Furthermore, he says:

“Many in our profession, including me, have subjected data to inadequate tests in the past.”

He believes that people need to accept this fact in order for the field to develop further.

Have you ever come across P – Hacking in the finance sector? Tell us about your experience in the comments section below.
