Scientific Anathema: The Black Box

Undisclosed or ‘black box’ techniques have long been with us. They have historically been offered either to preserve a technological secret or to provide a cloak under which deception and exaggeration may be perpetrated. The first motivation comes from the very understandable desire to protect a technological advantage from others, while the second springs from a desire to take advantage of others. Whenever a vendor comes forward with a technology that they are unwilling to adequately explain, the user or client must wonder which of these motivations is in play. In the realm of exploration (a scientific realm where we often take on the role of specialists who give investment advice), the acceptance of any ‘black box’ technique would seem foolish. Indeed, why would anyone ever accept without question an unexplained product, no matter what its role or function? Despite the obviousness of this question (and it has been asked many times before), we continue to be assailed by black box techniques. The last part of this note illustrates a recent experience with one.

Introduction: the scientific method

The scientific method is a relatively recent approach to dealing with the world. It has allowed our culture to flourish, our achievements to accelerate. The scientific method is an approach to learning and understanding. At the heart of the method is the creation of a conceptual framework or theory that explains a method or observation. This theory is used to create predictions or hypotheses about the world or method. The really courageous thing about the scientific method is that these hypotheses are put to a series of tests to determine if they hold up. As long as the hypothetical predictions survive the crucible of repeated testing, the theory is maintained.

This rigorous approach to understanding the world has helped mankind to root out bad theories and create better ones. Always there is the crucible of the test, and the willingness to sacrifice the idea on the altar of truth. The importance of the prediction and the test is so great that without them, ideas cannot be considered scientific. Where does this leave us with black box techniques?

If a product is introduced, but the conceptual framework behind it is kept secret, it is a black box. The concept of not disclosing the theory behind a device is anathema to the scientific method. Without the theory, how can appropriate hypothetical tests be created with which to test the black box? If no testing can occur, how can the approach be considered scientific? It cannot.

Reasons for the continued existence of black box techniques

Despite the widespread acceptance of the scientific method, black box techniques or products continue to thrive. One of the reasons for this is the innate emotional nature of human beings. Despite our understanding of the rational, purposeful approach of the scientific method, we have retained the emotional need to ‘believe’, even when we should test. Sometimes this need to believe is fueled by lack of discipline, self-delusion or even greed. Sometimes the more beautiful human trait of dreaming, and of keeping a mind open to possibilities and greater things is the agent of our belief. The scientific method does not, however, require a closed mind, but rather a mind ready to test any idea.

Not all persons who put forth a black box product are charlatans or tricksters. Some inventors seek to protect a novel but easily reproducible technique. It is understandable that innovators should desire to be the sole beneficiaries of the products of their labor. Whatever sympathy we might have for such an inventor, it does not change the fact that if not enough information about their device is disclosed, we may be denied the ability to adequately test it and hence be unable to treat it scientifically. Given the great responsibility that geoscientists have in providing investment advice, how could we accept such a device? Despite the seeming strength of this argument against black box technology, we are often confronted by an unexplained technique that, it is argued, will benefit us greatly.

Case study example: imaging / resolution technology

A geologist from a small company recently contacted me with questions regarding some new resolution and imaging enhancement technique. I had never heard of it, but made enquiries. Eventually I found a representative of the company, which we shall call company ‘X’. X’s representative could not and would not explain how their process worked, but he made the following claims:

The product would give increased resolution to the data. From a bandwidth of about 80 Hz, I might get up to 200 Hz. This would even be the case for band limited Vibroseis data.
This increased resolution would come as part of some novel, high-resolution velocity/imaging algorithm that was apparently hardwired into a special computer system.

I sent over a 2-D seismic line that had three wells drilled into it, two of them this year with a full suite of logs. This dataset was also band limited Vibroseis data, with a high frequency cutoff of 96 Hz. This line had been conventionally processed by several different processors, so its bandwidth and signal to noise characteristics were well known. Within a few weeks, I received the result from company X. The data that I got appeared to be very high in frequency content and very continuous. X’s representative showed me filter panels that he argued proved that I had spatially continuous frequency information beyond 160 Hz, and perhaps as high as 200 Hz.

I was perplexed by this result and asked how they could have achieved frequency content beyond the Vibroseis sweep bandwidth. The reply was uninformative. A reference was made to the use of harmonics, but no further explanation of the technique itself was given.
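To see why frequency content beyond the sweep is surprising, the following minimal sketch computes the amplitude spectrum of an idealized, untapered linear sweep at one frequency inside the sweep band and one well above it. Only the 96 Hz upper limit comes from the text; the 10 Hz start frequency, 4 s sweep length, 2 ms sampling and the function names are assumptions made for illustration. The sketch does not model the harmonic distortion that real vibrators generate, so it says nothing about whether harmonics could be exploited.

```python
import cmath
import math


def linear_sweep(f_start_hz, f_end_hz, duration_s, dt_s):
    """Samples of an idealized, untapered, unit-amplitude linear sweep."""
    n = int(round(duration_s / dt_s))
    rate = (f_end_hz - f_start_hz) / duration_s  # sweep rate in Hz per second
    samples = []
    for i in range(n):
        t = i * dt_s
        phase = 2.0 * math.pi * (f_start_hz * t + 0.5 * rate * t * t)
        samples.append(math.cos(phase))
    return samples


def amplitude_at(signal, freq_hz, dt_s):
    """Magnitude of the discrete Fourier sum of `signal` at a single frequency."""
    total = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * i * dt_s)
        for i, x in enumerate(signal)
    )
    return abs(total) * dt_s


DT = 0.002  # 2 ms sampling (assumption)
# 10 Hz start and 4 s duration are assumptions; 96 Hz is the cutoff quoted in the text.
sweep = linear_sweep(10.0, 96.0, 4.0, DT)

in_band = amplitude_at(sweep, 50.0, DT)
out_of_band = amplitude_at(sweep, 150.0, DT)
print(f"amplitude at 50 Hz:  {in_band:.4f}")
print(f"amplitude at 150 Hz: {out_of_band:.4f}")
print(f"ratio (150 Hz / 50 Hz): {out_of_band / in_band:.3f}")
```

Under these assumptions the amplitude above the sweep band is a small fraction of the in-band amplitude, which is why a claim of usable signal at 160 to 200 Hz calls for an explanation.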

This is the kind of black box problem that geoscientists in our industry are sometimes faced with: an apparently fantastic result from an unexplained technology. On casual inspection, the result was so far superior in resolution to what had been achieved in the area, and to what the source should be capable of giving, that suspicion about its veracity was high. Nevertheless, even the remote possibility that X was producing valid results was intriguing. The question was: how do I determine if the technique is valid when X will not disclose the theory to me?

Test Setup

Since I had two new log suites (called well A and well B for confidentiality), I had the ability to test the resolution claims of the process. The imaging and velocity analysis data were not disclosed to me, so I could not test that aspect of X’s work. The plan was to test the resolution improvement using the well-known cross-correlation technique between the log synthetics and the processed seismic data. To ensure the best results, a mild stretch was applied to the logs. Figures 1 and 2 show a conventionally processed version of the test line with the stretched synthetic ties. The tie is very good. Figures 3 and 4 show the correlation that the stretched synthetics give to the conventional data. The tie is excellent, and the stretch appeared to be appropriate. The synthetics were sampled at a 0.25 ms sample rate and used the sonic and density curves. The borehole condition was excellent in both wells, and the log curves required almost no editing for quality. This sampling and data quality are sufficient to support synthetics with the same frequency content that X boasts in their processed results.
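The cross-correlation test described above can be sketched as follows. This is a minimal illustration, not the software used in the study: the function names, the toy Ricker-wavelet traces, the 2 ms trace sampling and the 5-sample shift are all assumptions, and the log stretch is not modeled. The first function also checks that a 0.25 ms sample rate comfortably supports frequencies of 200 Hz.

```python
import math


def nyquist_hz(sample_interval_ms):
    """Highest frequency representable at a given sample interval."""
    return 1000.0 / (2.0 * sample_interval_ms)


def ricker(t_s, peak_hz):
    """Ricker wavelet value at time t_s (used only to build toy traces)."""
    a = (math.pi * peak_hz * t_s) ** 2
    return (1.0 - 2.0 * a) * math.exp(-a)


def normalized_xcorr(synthetic, seismic, max_lag):
    """Correlation coefficient per lag; a positive lag means the seismic event is later."""
    n = min(len(synthetic), len(seismic))
    energy_syn = sum(x * x for x in synthetic[:n])
    energy_seis = sum(y * y for y in seismic[:n])
    norm = math.sqrt(energy_syn * energy_seis)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        start, stop = max(0, -lag), min(n, n - lag)
        dot = sum(synthetic[i] * seismic[i + lag] for i in range(start, stop))
        result[lag] = dot / norm if norm > 0.0 else 0.0
    return result


print("Nyquist at 0.25 ms:", nyquist_hz(0.25), "Hz")

# Toy traces (hypothetical): the same 30 Hz wavelet, 5 samples later on the "seismic".
DT = 0.002
synthetic = [ricker((i - 50) * DT, 30.0) for i in range(128)]
seismic = [ricker((i - 55) * DT, 30.0) for i in range(128)]

corr = normalized_xcorr(synthetic, seismic, max_lag=20)
best_lag = max(corr, key=corr.get)
print("best lag (samples):", best_lag, "coefficient:", round(corr[best_lag], 3))
```

A good tie shows up as a single strong peak near zero lag; an uncorrelated pair of traces gives low coefficients at every lag.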

Figure 1: 70/80 Hz Synthetic from Well A Tied to Standard Processed 2-D Line.

Figure 2: 70/80 Hz Synthetic from Well B Tied to Standard Processed 2-D Line.

Company X’s data was filtered to create three datasets:

Original X dataset claiming up to 200 Hz frequency content.
A version of the X dataset filtered back to 5/15-80/100 Hz, the bandwidth of the dataset under conventional processing.
A version of the X dataset filtered at 80/100-170/200 Hz, isolating the part of the spectrum that X claims its processing imaged where conventional methods could not.
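The filter bands above can be read as four corner frequencies each. The sketch below assumes the usual reading (zero gain below the first corner, full gain between the second and third, zero above the fourth) and assumes linear ramps between corners; the actual ramp shape used in the study is not stated, and the function name is hypothetical. With linear ramps, the two bands sum to unit gain across their shared 80 to 100 Hz ramp, so together they split the original spectrum between 15 and 170 Hz without double counting.

```python
def trapezoid_gain(freq_hz, f1, f2, f3, f4):
    """Gain of a four-corner bandpass with linear ramps (an assumed ramp shape)."""
    if freq_hz <= f1 or freq_hz >= f4:
        return 0.0
    if freq_hz < f2:
        return (freq_hz - f1) / (f2 - f1)  # low-side ramp up
    if freq_hz <= f3:
        return 1.0  # flat pass band
    return (f4 - freq_hz) / (f4 - f3)  # high-side ramp down


LOW_BAND = (5.0, 15.0, 80.0, 100.0)      # conventional-processing bandwidth
HIGH_BAND = (80.0, 100.0, 170.0, 200.0)  # the part of the spectrum X claims to add

for f in (10.0, 50.0, 90.0, 120.0, 185.0):
    low = trapezoid_gain(f, *LOW_BAND)
    high = trapezoid_gain(f, *HIGH_BAND)
    print(f"{f:6.1f} Hz  low-band gain {low:.2f}  high-band gain {high:.2f}")
```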

The hypothesis being tested is: If X has imaged meaningful information beyond the standard area bandwidth, then that information should correlate to an appropriately stretched well sampled synthetic. Since we already know that the stretch and sampling of the synthetic produces a very good tie to conventional data, we expect that this correlation test is valid. The use of two such synthetic ties in this test should add to the meaningfulness of the test. If X’s high frequency data has no correlation to the synthetic, then we must conclude that this data is meaningless, misleading and invalid. If the high frequency data does tie, then X’s (high frequency) work may be meaningful.

Figure 3 (L): Correlation of Well A Synthetic to Standard Processed Data.
Figure 4 (R): Correlation of Well B Synthetic to Standard Processed Data.

Figures 5 and 6 show the ties between the two synthetics and X’s unfiltered results. To the casual observer there appear to be places where the correlation looks good, and places where it does not. The high frequency data appears to be very continuous and signal-like. Figure 7 shows the cross correlation of the high frequency synthetics to the unfiltered data. The correlation is significant, although noisy. The lack of compression of the central peak suggests that the correlation exists only for the lower bandwidth.
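The remark about the central peak rests on a general property: the wider the correlated bandwidth, the narrower the central correlation peak. The sketch below illustrates this with the autocorrelation of an idealized signal whose spectrum is perfectly flat between two frequencies. The flat-spectrum model, the band edges of 10-80 Hz and 10-200 Hz, and the function names are simplifying assumptions, not the spectra of the actual data.

```python
import math


def flat_band_autocorr(tau_s, f_low_hz, f_high_hz):
    """Normalized autocorrelation of a signal with a flat spectrum between two frequencies."""
    if tau_s == 0.0:
        return 1.0
    bandwidth = f_high_hz - f_low_hz
    numerator = (math.sin(2.0 * math.pi * f_high_hz * tau_s)
                 - math.sin(2.0 * math.pi * f_low_hz * tau_s))
    return numerator / (2.0 * math.pi * bandwidth * tau_s)


def first_zero_crossing_ms(f_low_hz, f_high_hz, step_ms=0.01):
    """Lag in ms at which the central peak first drops to zero (simple stepwise search)."""
    tau_ms = step_ms
    while flat_band_autocorr(tau_ms / 1000.0, f_low_hz, f_high_hz) > 0.0:
        tau_ms += step_ms
    return tau_ms


narrow = first_zero_crossing_ms(10.0, 80.0)   # roughly the conventional bandwidth
wide = first_zero_crossing_ms(10.0, 200.0)    # the bandwidth X claims
print(f"half-width of central peak, 10-80 Hz:  {narrow:.2f} ms")
print(f"half-width of central peak, 10-200 Hz: {wide:.2f} ms")
```

For this idealized model the first zero falls at 1/(2(f_low + f_high)), about 5.6 ms for the narrower band and about 2.4 ms for the wider one. If the data truly correlated with the synthetic out to 200 Hz, the central peak should therefore be visibly narrower than the one obtained from conventional data.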

Figure 5: 150/200 Hz Synthetic from Well A Tied to X’s Processed 2-D Line.

Figure 6: 150/200 Hz Synthetic from Well B Tied to X’s Processed 2-D Line.

Let us look at the high cut (filtered back) version of X’s work. The correlations are shown in Figures 8 and 9. They are excellent, being quite comparable to the correlation achieved with conventional processing.

Lastly, let us look at the correlation with the low cut, or high frequency only, version of X’s work. The correlations are shown in Figures 10 and 11. The synthetics and this data are uncorrelated. There is no meaningful correlation whatsoever. The high frequency information that X produced and claimed was meaningful earth reflection information is clearly invalid. It has failed the hypothesis test.

Figure 7: Correlation of Both (10/15-160/200 Hz) Synthetics to X’s Unfiltered Data.

This test seems to indicate that X’s claims of higher frequencies are false. This work was shown to X’s representative, who could offer no explanation for the lack of correlation of the higher frequencies. It is impossible to know from this test what it was about X’s technique that did not work on this dataset. We know that in this test the high frequency claims are false, but not why. X’s method apparently uses some high resolution velocity imaging or migration algorithm, which we cannot comment on. X did not produce its apparently highly resolved velocity field or any quality control plots for the imager at all. The continuous signal-like look of the false high frequency data cannot be explained because nothing about the method is adequately explained. This false, continuous signal is a most troubling mystery as it may fool the unwary into believing that it is meaningful.

Figure 8: Correlation of Well A Synthetic to 5/15-80/100 Hz Bandpass of X’s Data.

Figure 9: Correlation of Well B Synthetic to 5/15-80/100 Hz Bandpass of X’s Data.

Was this test scientific?

This test was scientifically clumsy. We do not know the theoretical framework. All that we have been able to test is one of X’s claims (the high frequency claim), and even that was only done on one dataset. It would be far better to test several datasets before drawing universal conclusions about X. Nevertheless, the dataset that we did have was a good one, because the logs were of high quality and correlated very well with the seismic data over the sweep bandwidth. That X would claim to effectively double the bandwidth of the Vibroseis sweep certainly required at least this one test. X’s persistent refusal to explain their algorithm, and their lack of quality control data, close the door on further work with them and leave me with feelings of suspicion. With no theoretical framework, no quality control data, and a failed test, it becomes almost as impossible to take X seriously as it is to form a true scientific test of their mysterious process.

Figure 10 (L): Correlation of Well A Synthetic to 80/100-170/200 Hz Bandpass of X’s Data.
Figure 11 (R): Correlation of Well B Synthetic to 80/100-170/200 Hz Bandpass of X’s Data.

If X is an honest company (and we cannot know for sure), then this situation is indeed a sad one. The lack of disclosure prevents us from being able to help X with whatever element of their method is not working as advertised. There may even remain some unrevealed element of genius in what they are doing, but it is wasted by secrecy. Black box approaches (such as X’s here) shut the larger scientific community out of the normal developmental cycle of technology, leaving the practitioners vulnerable to falling into an isolated technical backwater.

Conclusions

Unexplained, or black box, technologies are not a part of a scientific methodology. If a method is so shrouded in mystery that a theoretical framework cannot be constructed, let alone appropriate testing hypotheses, then the application of the method cannot be called scientific. Responsible earth scientists cannot and should not be persuaded to use any technology that is not adequately explained to them. In the case study where company X made claims of fantastic resolution without explanation, it was shown that the claims were false (at least in the example dataset). The danger of X’s work is that their data appears to be valid to the casual observer. It looks like continuous signal. The high frequency signal in this example is misleading and erroneous. The case study of X’s work does nothing to allay our suspicions of black box approaches. Not only is the black box approach a rejection of the scientific method, it is a rejection of the scientific community and scientific progress. Through the wall of secrecy that black box practitioners erect, they force themselves into an unhealthy technical isolation. Honest inventors should be encouraged to protect themselves through patents or confidentiality agreements rather than excess mystery.
