The Mathematics of Drilling Intercepts
Introduction
Drilling intercepts are a prominent feature of junior mining news releases. They are closely monitored by the mining investment community, and a particularly good intercept can raise the prospects for a project.
As an example, consider this November 10, 2020 release from Freegold Ventures:
Freegold Intercepts 3.78 g/t Au Over 119 Metres Including 131.5 g/t Over 3 Metres Within 573 Metres of 1.21 g/t Au at Golden Summit
The market responded with a 3% boost in the share price the next trading day, so this was clearly regarded as a positive signal for the company's prospects. (This is typical: capital markets tend to treat any news of this sort as good news.) The implications for the economic, geological, and engineering variables surrounding the project are much less clear. Is this a good geological result? Is it a good engineering result? Intercepts are highlights: incomplete data, collected and released selectively. Is it even possible to make an informed judgement using these numbers?
To complicate things even further, the selectively reported drilling intercepts are usually released in a rather complex manner, which can make it difficult to distinguish between truly good numbers and deceptively good results. Drilling intercepts are discussed at great length in other sources (here and here and here) but we'll take a mathematical perspective and develop a model that describes nested intercept configurations of arbitrary complexity.
We'll take Great Bear Resources for an extended example. Great Bear Resources is a Canadian junior mining company whose stock gained substantially on announcement of very high grade intercepts at their Dixie project in Ontario. At the time of writing, GBR is trading at an $886 million CAD market cap (which is not very bearish at all!)
Here we open up the spreadsheet of drilling results (available on their website), and then filter on the Drill Hole field to consider a single hole:
import pandas as pd

# Load the full composite table and keep a copy of the original row index
intercepts = pd.read_excel('drive/My Drive/Projects/posts/data/Great_Bear/lp_drill_hole_composites_all.xlsx')
intercepts['Record'] = intercepts.index
# Filter down to a single drillhole
dh = intercepts[intercepts['Drill Hole'] == 'BR-022']
dh
This is how intercepts are typically presented: a table with a From field describing where they started measuring, a To field describing where they stopped, and a grade field (called Gold here) that tells us how enriched that interval is with the valuable stuff. From and To are typically measured downhole from the drill collar.
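To make the layout concrete, here is a tiny, entirely hypothetical intercept table built the same way (the drillhole name and every value are invented for illustration):

```python
import pandas as pd

# Hypothetical intercepts: From/To measured downhole in metres, grade in g/t gold
toy = pd.DataFrame({
    'Drill Hole': ['XX-001', 'XX-001', 'XX-001'],
    'From (m)':   [110.0, 111.4, 200.0],
    'To (m)':     [116.1, 113.1, 210.0],
    'Gold (g/t)': [2.62, 7.92, 1.10],
})
toy['Width (m)'] = toy['To (m)'] - toy['From (m)']
print(toy)
```

Note that the second row sits entirely inside the first, which is exactly the nesting we'll have to untangle below.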
It's easy to establish a basic understanding of how these tables are read, and many experienced mining investors immediately recognize these grades as very high. The rest of us might need to rely on statistics, since we don't have the benefit of many years' experience with drilling results.
Of course, it is first necessary to determine the true assay values for each separate interval from top to bottom. Unfortunately, the rows are not independent: some of the intercepts are contained in others, and the subinterval gold is included in the parent interval's calculation. So we can't use the Gold (g/t) field directly, since intercepts are reported with these "highlights", or higher-grade sections within the longer interval.
Sometimes this convention is used unethically to suggest larger intervals of enrichment than truly exist. This is called "grade smearing", and the residual grade calculation applied here will expose any such attempt to disguise poor results.
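A quick hypothetical shows why the residual grade exposes smearing. Suppose a release reported 100 m @ 1.0 g/t "including" 2 m @ 49 g/t (numbers invented for illustration); a simple metal balance shows the remaining 98 m carry almost nothing:

```python
# Hypothetical smeared intercept: 100 m @ 1.0 g/t including 2 m @ 49.0 g/t
parent_len, parent_grade = 100.0, 1.0
child_len, child_grade = 2.0, 49.0

# Metal balance: parent metal minus child metal, over the remaining length
residual = (parent_len * parent_grade - child_len * child_grade) / (parent_len - child_len)
print(residual)  # ~0.02 g/t: nearly all the gold sits in the 2 m slice
```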
At first it may seem like the correct interpretation is to imagine the intervals stacked on top of one another, but this is very misleading. We can easily visualize this to see the error:
- We only have the total grade, including the high-grade child subintervals. Stacking the intervals ignores the fact that the high-grade intervals are contained within the wider, lower-grade intervals, inflating the grade measured over that length. This has enormous implications for the continuity of the mineralization, which determines the feasibility of the project.
In order to eliminate this effect we'll need to do some math with the intercepts. This visualization attempts to show this hierarchical, branching structure:
Plotted side by side, the intercepts show the parent-child overlapping relationship and capture the complexity of the problem.
Parent intervals can have no child intervals, a single child interval, or several child intervals. Child intervals themselves can have no child intervals, a single child interval, or several child intervals. Clearly there is a whole class of related problems we could solve with a general solution to this problem.
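To see what a general solution has to handle, here is a minimal sketch of the residual-grade calculation for one parent and its direct children, using an invented tuple format (not the spreadsheet's); the tree machinery developed below extends this to arbitrary nesting:

```python
# Hypothetical nested intercepts: (from_m, to_m, grade_gpt, [children])
def residual_over_parent(interval):
    frm, to, grade, children = interval
    length = to - frm
    metal = length * grade  # total grade-length ("metal") in the parent
    for c_frm, c_to, c_grade, _ in children:
        metal -= (c_to - c_frm) * c_grade   # pull out the child's metal
        length -= (c_to - c_frm)            # and its length
    return metal / length

# e.g. 100 m @ 1.0 g/t including 2 m @ 20.0 g/t
hole = (0.0, 100.0, 1.0, [(2.0, 4.0, 20.0, [])])
print(residual_over_parent(hole))
```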
So far we have treated the From and To fields in isolation; we can use a convenient feature of pandas to convert them to intervals:
dh['Interval'] = dh.apply(lambda x: pd.Interval(x['From (m)'], x['To (m)']), axis=1)
dh
The motivation here was to create Interval objects so that we can use their overlaps method to model the overlap relationship among the different intervals:
import itertools
import numpy as np

# Test every pair of intervals for overlap
cross_interval = itertools.product(dh.Interval, dh.Interval)
overlap_matrix = np.array([a.overlaps(b) for a, b in cross_interval])
# The flat array has one entry per pair; recover the square matrix
ns = int(np.sqrt(overlap_matrix.shape[0]))
overlap_matrix_sq = overlap_matrix.reshape(ns, ns)
Here we see the overlaps: if a pixel is white, it means that the interval on the x-axis and the interval on the y-axis overlap.
Overlap is symmetric: each 'child' overlaps its parent and vice versa. What we are actually interested in is the "contains" relationship, which is not symmetric and will let us identify parent intervals and child intervals and start reducing them.
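The distinction is easy to check directly (a small sketch using a pandas interval for the overlap test and a sympy interval for the containment test):

```python
import pandas as pd
from sympy import Interval

a, b = pd.Interval(0, 10), pd.Interval(2, 3)
# overlaps() is symmetric: it cannot tell parent from child
print(a.overlaps(b), b.overlaps(a))  # True True

# is_proper_superset() is not symmetric: it identifies the parent
print(Interval(0, 10).is_proper_superset(Interval(2, 3)))  # True
print(Interval(2, 3).is_proper_superset(Interval(0, 10)))  # False
```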
Fortunately this is also supported in Python:
from sympy import Interval

# sympy Intervals support proper-superset tests, giving us "contains"
dh['Interval_obj'] = dh.apply(lambda x: Interval(x['From (m)'], x['To (m)']), axis=1)
cross_interval = itertools.product(dh.Interval_obj, dh.Interval_obj)
contain_matrix = np.array([a.is_proper_superset(b) for a, b in cross_interval])
contain_matrix_sq = contain_matrix.reshape(ns, ns)
Now we can pull out a tree. Of the machine-intelligible formats, a tree data structure is clearly the best suited to representing these nested intervals.
from anytree import Node, RenderTree, AsciiStyle

# One node per record, plus a synthetic root for the top-level intervals
root = Node('root')
all_str_to_node = {str(r): Node(str(r)) for r in dh['Record'].values}
for i, col in enumerate(contain_matrix_sq.T):
    if not np.any(col):
        all_str_to_node[str(dh['Record'].values[i])].parent = root
    else:
        # parent = the innermost (last) containing interval
        all_str_to_node[str(dh['Record'].values[i])].parent = all_str_to_node[str(dh['Record'].values[::-1][np.argmax(col[::-1])])]
print(RenderTree(root, style=AsciiStyle()).by_attr())
Now we are really getting somewhere: since we now know which intervals are not "child" intervals, we can actually start looking at the global picture.
These top-level intervals are the direct children of the root. We can go ahead and plot them and have an accurate picture of the log:
dh_prime.dtypes
While that is correct, it is not complete: we have left out all of the additional information provided by the smaller subintervals! To incorporate that, we will have to remove them from the parent intervals and determine the residual grade (whatever is left once we pull out the gold contained in the subintervals).
# Residual grade of the Freegold example discussed below: ~0.477 g/t
((119) * (3.78) - (3) * (131.5)) / (119 - 3)
As an example of this kind of calculation, consider a simpler set of intervals from a Freegold Ventures press release:
Freegold Intercepts 3.78 g/t Au Over 119 Metres Including 131.5 g/t Over 3 Metres Within 573 Metres of 1.21 g/t Au at Golden Summit
We know the gold grade over the whole 119 metres and the gold grade over the 3 metre subinterval, but what is the gold grade over the remaining $119 - 3 = 116$ m?
The solution is a simple weighted average calculation, like compositing over a drillhole:

$\frac{119 \times 3.78 - 3 \times 131.5}{119 - 3} = 0.477\ \text{g/t}$
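Wrapped up as a small helper function (a sketch; the computation on the dataframe below works directly on the tree), the same weighted average looks like:

```python
def residual(parent_len, parent_grade, child_len, child_grade):
    """Grade left over the parent interval once a reported child is removed."""
    return (parent_len * parent_grade - child_len * child_grade) / (parent_len - child_len)

# The Freegold example: 119 m @ 3.78 g/t including 3 m @ 131.5 g/t
print(round(residual(119, 3.78, 3, 131.5), 3))  # 0.477
```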
Credit to https://twitter.com/BrentCo77759016/status/1326183861722599424 and
So now we have to do this with every single subinterval until we get the residual grade at every point along the drillhole.
Fortunately, the tree data structure we selected has specialized methods that make a traversal very simple.
from anytree import LevelOrderIter

# (name, children) pairs in level order: parents come before their children
levelord_nodes = [(node.name, node.children) for node in LevelOrderIter(root)]
levelord_nodes
nn_np_loi = [(node.name, node.parent) for node in LevelOrderIter(root)]
all_str_to_node
for node, parent in nn_np_loi[::-1][:-1]:
print(node)
for child in all_str_to_node[node].children:
print(child)
dh
cross_interval = itertools.product(dh.Interval_obj,dh.Interval_obj)
intersect_matrix = np.array([interval[0].intersect(interval[1]) for interval in cross_interval])
intersect_matrix
# "Grade-length": grams per tonne of gold times metres of interval
dh['grade_len'] = dh['Gold (g/t)'] * dh['Width (m)']
dh
from sympy import Union

resid_grades, resid_len = {}, {}
# Walk the tree in level order, peeling each node's children out of it
for node in levelord_nodes[1:]:
    parent_interval = dh[dh['Record'] == float(node[0])]
    child_names = [child.name for child in node[1]]
    child_intervals = [dh[dh['Record'] == float(child)] for child in child_names]
    # Set difference: the parent minus the union of its children
    # (Union takes its member sets as separate arguments, hence the unpacking)
    new_interval_obj = parent_interval.Interval_obj.values[0] - Union(*[c.Interval_obj.values[0] for c in child_intervals])
    lg_child_int = [dh.loc[int(child_name)]['grade_len'] for child_name in child_names]
    lg_total_int = parent_interval.grade_len.values[0]
    residual_grade = (lg_total_int - sum(lg_child_int)) / new_interval_obj.measure
    resid_grades[node[0]] = residual_grade
    resid_len[node[0]] = new_interval_obj
    print("Interval:", node[0])
    print("Length x Grade:", lg_total_int - sum(lg_child_int))
    print("Residual Grade:", residual_grade)
Check these solutions. Record 95 should be easy to validate, since it includes record 96:
- 95 extends from 110.00 m to 116.10 m (l = 6.10 m)
- 96 is the interval from 111.40 m to 113.10 m (l = 1.70 m)
Balancing the gold in the larger interval:

$(6.10\ \text{m})(2.62\ \text{g/t}) = (1.7\ \text{m})(7.92\ \text{g/t}) + (4.4\ \text{m})(x\ \text{g/t})$

Rearranging terms, we get:

$x = \frac{(6.10\ \text{m})(2.62\ \text{g/t}) - (1.7\ \text{m})(7.92\ \text{g/t})}{4.4\ \text{m}}$
So the residual grade is 0.5723 g/t, which matches the value found above!
((6.10 * 2.62) - (1.7)*(7.92)) / (4.4)
resid_len
Interval(445.000000000000, 452.000000000000).end
dh['Record'].astype(str).map(resid_len)
dh['Record'].astype(str).map(resid_grades)
TODO: we need to split up the non-contiguous segments so that we can actually plot them.
Idea: use the .args attribute of a Union to get its component intervals. Problem: .args is also defined for Interval objects, where it returns the endpoints and open/closed flags rather than subintervals.
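A quick sympy check of both behaviours:

```python
from sympy import Interval, Union

# An interior child splits the parent into two disjoint pieces
residual = Interval(110.0, 116.1) - Interval(111.4, 113.1)
print(isinstance(residual, Union))  # True
print(residual.args)                # the two component intervals

# For a plain Interval, .args is (start, end, left_open, right_open) instead
print(Interval(0, 1).args)
```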
dh['Record'].astype(str).map(resid_len)
def args_extended(interval):
    # Unions split into their component intervals; plain Intervals pass through
    if isinstance(interval, Union):
        return interval.args
    return interval
remaining_interval_grade = pd.DataFrame({'interval' : dh['Record'].astype(str).map(resid_len), 'grade' : dh['Record'].astype(str).map(resid_grades)})
remaining_interval_grade['split_intervals'] = remaining_interval_grade.interval.apply(args_extended)
rig_exploded = remaining_interval_grade.explode('split_intervals')
rig_exploded
rig_exploded['From'] = rig_exploded.split_intervals.apply(lambda x: x.start).astype(float)
rig_exploded['To'] = rig_exploded.split_intervals.apply(lambda x: x.end).astype(float)
rig_exploded['Width'] = (rig_exploded.To - rig_exploded.From).astype(float)
rig_exploded['grade'] = rig_exploded.grade.astype(float)
rig_exploded['drillhole'] = 'BR-022'
rig_exploded[['From', 'To', 'grade', 'Width']]
import altair as alt

y_axis = alt.Axis(
    title='Intercept ID',
    offset=5,
    ticks=False,
    domain=False
)
reqd_cols = ['From', 'To', 'grade', 'Width', 'drillhole']
alt.Chart(rig_exploded[reqd_cols]).mark_bar().encode(
alt.X('From:Q',
scale=alt.Scale(zero=False)),
x2='To:Q',
y=alt.Y('drillhole:N', axis=y_axis),
color=alt.Color('grade:Q', scale=alt.Scale(scheme="inferno")),
tooltip=[
alt.Tooltip('Width:Q', title='Width'),
alt.Tooltip('grade:Q', title='Gold Grade')
]
).properties(width=800, height=100).configure(background='#D9E9F0').interactive()