In my final year of engineering undergrad, I worked under Dr. Moriba Jah to build out an automated orbital debris detection system using machine learning and open-source data. I made a ton of mistakes and didn’t get as far as I wanted, but I did get to learn from some truly spectacular researchers.
Orbital Debris
There has been a considerable increase in the number of things humans have lofted into space, specifically into low-Earth orbit (LEO), in the past ~6 years. In 2016, there were about 6,000 satellites that had ever been launched into space, according to the United Nations Office for Outer Space Affairs [1]. Of those, about 4,000 were still orbiting the planet, many of them launched by the USA and USSR between 1957 and 1999. At the time of writing this post (EOY 2022), there are about 8,500 satellites in orbit, and more than 12,000 satellites total have been thrown into space by humans.
Earth is a shared resource, and LEO is no different. The exponential growth in the use of LEO by satellites is not a harmless thing and could result in some nasty outcomes, namely Kessler Syndrome. When objects in orbit break up, the debris that is created often does not immediately deorbit, and depending on the altitude, it can remain floating in the sky for eons. Except it doesn’t float – it careens around the planet at thousands of miles per hour with ludicrous amounts of kinetic energy. The danger is obvious, but what isn’t is the positive feedback loop of destruction and debris that feeds on the now numerous satellites in orbit: Kessler Syndrome [2]. Even without the syndrome, space debris is still a dangerous and rather expensive threat – assets worth billions of dollars in space have to be monitored increasingly carefully to guard against collisions, and it doesn’t always work.
Space Environmentalism
Dr. Moriba Jah, who was recently named a MacArthur Fellow for his work on the subject, is an astrodynamicist and prominent researcher on orbital debris. He and his colleagues have developed a platform for aggregating various data sources into a complete catalogue of space objects, ASTRIAGraph, as well as visualization tools like Wayfinder, available for free use in the public domain. In particular, the novel statistical methods employed in these tools highlight the prominent discrepancies between multiple data sources: data on space objects is often anything but tidy.
Researchers like Jah have been advocating for “space environmentalism” for decades. That is, treating near-Earth orbit as a finite resource that should be governed by norms of behavior, accessible data sharing, and policies & regulations that support sustainable and equitable international use of space. Just like most natural resources, it is endangered by carefree use by the rich and the few in spacefaring nations, which is indeed everyone’s problem.
The Project
The system I worked to develop in Jah’s lab is known as Re-entry Analyses from Serendipitous Radar Data (RASR). As the frequency of satellite launches into LEO and other orbits has significantly increased, the risk of catastrophic collision with unknown debris has increased in lockstep. Tracking orbital debris requires high-level data fusion across many heterogeneous data sources. RASR is an attempt to create a reliable, automated detection system for meteors and anthropogenic objects in space from readily available atmospheric data sets. Work done by the Astromaterials Research and Exploration Science (ARES) department of NASA Johnson Space Center explored the feasibility of this concept to bolster the number of meteorite re-entry detections made, which currently rely on eyewitness reports and may cover roughly 0.3% of total re-entries. As demonstrated by those researchers, Level II NEXRAD data products (NEXRAD being the network of next-generation Doppler radar systems) can be leveraged for low-atmospheric (<10 km) meteorite trajectory determination and effectively aid in the process of meteorite recovery.

Terminal trajectory from radar data. Note that this data collection process remains rather happenstance, essentially augmenting the already sparse eyewitness reports.
The RASR project expands on the proof-of-concept work of ARES and leverages its small database of detections to extend an automated detection system into the real-time domain. RASR sifts through real-time atmospheric data and detects phenomena, outputting a confidence value for each detection based on fall velocity signatures and its detection neural network. Specifically, my work was to bolster the scale at which RASR operates as well as increase the precision (and provide a method to measure it!) of the binary classification results.
RASR isn’t meant to save billions of dollars per se, but it fills a glaring gap in the growing orbital debris industry: there is so much uncertainty concerning each piece of debris that we often cannot predict when and where a given piece will re-enter. Even if we could, we don’t have a system to track these events that scales. If an infrastructure like RASR is catching these re-entry events daily with very little overhead, that is a lot of very useful and novel data that can be propagated back in time to when the debris was in orbit. Crucially, this allows us to constantly interrogate the knowledge we believe we have of debris in orbit and sound the alarms when we were wrong. If a piece of debris falls, that data can tell us precisely how good of a job we were doing at tracking it, if we were even tracking it at all.
Data Sources
The Weather Surveillance Radar-1988 Doppler (WSR-88D, or NEXRAD) is a pulse-Doppler weather radar used to indirectly monitor meteorological and hydrological phenomena. There are 149 operational radars deployed throughout the continental U.S. and specific regions globally. Each NEXRAD radar continuously scans a region of the sky in one of several modes (VCPs, covered below) and produces fields of three discrete parameters: radar reflectivity factor, mean Doppler radial velocity, and spectrum width of the Doppler velocity spectrum. At the highest temporal and spatial resolution publicly available, these data represent the fully three-dimensional Level II data as a time series, which is archived on the NCDC database. In 2010, the network saw the deployment of Dual Polarization capability, operational as of May 2011. By adding vertical polarization to the legacy horizontal radar waves, the upgrade made additional data streams of differential reflectivity, correlation coefficient, and differential phase shift available in Level II data.

Each NEXRAD operates by sending out 0.5°-wide radio pulses while sweeping out a 360° pattern at a given elevation, then listening for the returning signals – the elevation is adjusted upon completion of a sweep and the process is repeated. This is known as the Volume Coverage Pattern (VCP), whose characteristics vary in frequency and fidelity depending on the “mode” of the NEXRAD radar.

VCP 31 & 32, the typical path of a radar beam in “clear air mode”
In the above graphic, you can think of a single band of color as a cross section of a full 360° sweep done in the shape of the edge of a funnel over a short time interval (a minute or so). The radar then stops, tilts upwards or downwards, and repeats the next sweep.
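To put rough numbers on that funnel shape, here’s a small sketch (my own illustration, not part of RASR) using the standard 4/3-effective-Earth-radius model for beam height; the elevation angles below are just illustrative low tilts like those in a clear-air VCP.

```python
# Rough sketch of where a radar beam's center sits in altitude as it sweeps
# outward, using the standard 4/3-effective-Earth-radius refraction model.
# The elevation angles are illustrative values for a low ("clear air") VCP.
import numpy as np

R_E = 6371.0e3            # Earth radius, m
K_E = 4.0 / 3.0           # effective-radius multiplier for standard refraction

def beam_height(slant_range_m, elev_deg, radar_alt_m=0.0):
    """Height of the beam center above the radar's ground level."""
    theta = np.radians(elev_deg)
    ae = K_E * R_E
    return (np.sqrt(slant_range_m**2 + ae**2
                    + 2.0 * slant_range_m * ae * np.sin(theta))
            - ae + radar_alt_m)

ranges = np.array([50e3, 100e3, 150e3, 230e3])      # slant ranges, m
for elev in (0.5, 1.5, 2.5, 3.5, 4.5):              # illustrative tilts, deg
    heights_km = beam_height(ranges, elev) / 1e3
    print(f"{elev:>4.1f} deg:", np.round(heights_km, 1), "km")
```

Even the lowest tilts climb to several kilometers of altitude by the edge of the scan, which is why these radars happen to catch re-entry debris in the last ~10 km of its fall.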
Importantly, regardless of the mode, processed data in the form of Level II data products is published directly to the public domain and can be obtained using RASR scripts, or any web-scraping tool for that matter. After the 2011 upgrades to Dual-Pol, the recorded data sets are essentially 7-dimensional: the six parameters listed above plus the time domain, since the radar sweeps out a slice of atmosphere in finite time and then moves on to the next elevation angle.
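For the curious, here’s a minimal sketch of one way to grab a Level II volume (not the actual RASR scripts): NOAA mirrors the archive to a public AWS bucket, noaa-nexrad-level2, and Py-ART can parse the files. The site and date below are just example values.

```python
# Minimal sketch: pull one NEXRAD Level II volume from NOAA's public S3
# archive and open it with Py-ART. Not the RASR pipeline -- just one way
# to get at the same public data.
import boto3
from botocore import UNSIGNED
from botocore.config import Config
import pyart

BUCKET = "noaa-nexrad-level2"                 # public NOAA archive on AWS
s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))

# Keys are laid out as YYYY/MM/DD/SITE/SITEYYYYMMDD_HHMMSS_V06
prefix = "2022/02/15/KFWS/"                   # example: Dallas/Fort Worth radar
resp = s3.list_objects_v2(Bucket=BUCKET, Prefix=prefix)
keys = [o["Key"] for o in resp.get("Contents", [])
        if not o["Key"].endswith("_MDM")]     # skip metadata marker files

local_file = "volume.nexrad"
s3.download_file(BUCKET, keys[0], local_file)

# Py-ART parses the Level II archive format into a Radar object holding all
# six moments (reflectivity, velocity, spectrum width, and the dual-pol fields)
radar = pyart.io.read_nexrad_archive(local_file)
print(radar.fields.keys(), radar.nsweeps)
```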
What is a Detection?
Using the NASA ARES database, a prototype detector can be built to visualize radar data, and the network can be trained on existing labeled detections. There are roughly 30 such labeled sweeps from re-entry events that have been painstakingly catalogued over the last 20 years.

Example detection visualized from Doppler Radial Velocity data, shown with bounding box and confidence.
In the visualized data above, a particularly distinct re-entry is shown in the velocity data measured by the NEXRAD system. This visualization is a flattened representation of the 3D sweep. The color represents the velocity of the matter detected relative to the radar station. The re-entry plume stands out against the gradient of the regular weather – there is a long, turbulent tail of hot particles and debris extending from the object falling from the sky.
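A flattened view like this can be reproduced outside of RASR with Py-ART’s plotting helpers – a hedged sketch below, assuming the volume file downloaded earlier and the standard “velocity” field name; the bounding box and confidence overlay are RASR’s own output and aren’t reproduced here.

```python
# Sketch: plot a single-elevation PPI of the radial velocity field from a
# Level II volume. Sweep index 1 is chosen to grab a low-elevation Doppler
# cut; vmin/vmax are just display choices.
import matplotlib.pyplot as plt
import pyart

radar = pyart.io.read_nexrad_archive("volume.nexrad")
display = pyart.graph.RadarDisplay(radar)

fig, ax = plt.subplots(figsize=(7, 6))
display.plot_ppi("velocity", sweep=1, ax=ax, vmin=-30, vmax=30,
                 cmap="RdBu_r", colorbar_label="radial velocity (m/s)")
display.set_limits(xlim=(-150, 150), ylim=(-150, 150), ax=ax)  # km from radar
plt.show()
```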
One of the primary constraints of this work is that there are very few labeled re-entry events to be found in the trove of radar data. There were, however, six additional dimensions of measurement in the radar data that RASR was not paying any mind to.
My Work
When I joined the project, key RASR functionality for retrieving atmospheric data and a simple convolutional neural network for detecting re-entry phenomena had been developed by Miller, Keh, and Sarda. Building on top of this, the goal of my work was threefold:
- Develop an automation, testing, and storage pipeline
- Develop cross validation techniques and integrate into pipeline
- Develop improvements on the existing RASR neural network model
Automation
I used compute nodes at UT Austin’s supercomputing facility, the Texas Advanced Computing Center (TACC), to run the automated RASR scripts that scan NEXRAD data daily. The focus here was usability, which matters most for establishing RASR at scale. My idea was to use TACC to brute-force create a dataset of potential re-entry detections based on the inaccurate, but workable, RASR scripts that already existed. This dataset could then be treated as a pre-labeled dataset, and the model could be fine-tuned from there. I created some data pipelines to handle the daily influxes of gigabytes worth of data, but I was never able to smoothly go from the massive set of potentially false detections I was generating to anything usable, because there really was no good way to correlate the detections with true positives. At least none that I prioritized – I didn’t get much further than those automated scripts.
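For a flavor of what the automation looked like, here’s a sketch of the daily scanning loop – every name in it is hypothetical rather than the real RASR code, and on TACC a loop like this would be wrapped in a SLURM batch script and fanned out across sites and nodes.

```python
# Hypothetical sketch of a daily scanning loop: list every Level II volume a
# radar published yesterday, run a detector over it, and append candidates to
# a CSV for later review.
import csv
import datetime as dt
import boto3
from botocore import UNSIGNED
from botocore.config import Config

s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))
SITES = ["KFWS", "KGRK", "KEWX"]                  # example subset of radars
OUTPUT = "candidate_detections.csv"

def list_volumes(site, date):
    """List Level II object keys for one site and one day."""
    prefix = f"{date:%Y/%m/%d}/{site}/"
    resp = s3.list_objects_v2(Bucket="noaa-nexrad-level2", Prefix=prefix)
    return [o["Key"] for o in resp.get("Contents", [])]

def run_detector(volume_key):
    """Stand-in for the RASR model: download, preprocess, classify.
    Returns a detection confidence in [0, 1]."""
    return 0.0                                     # placeholder

yesterday = dt.date.today() - dt.timedelta(days=1)
with open(OUTPUT, "a", newline="") as f:
    writer = csv.writer(f)
    for site in SITES:
        for key in list_volumes(site, yesterday):
            confidence = run_detector(key)
            if confidence > 0.5:                   # arbitrary screening threshold
                writer.writerow([site, key, confidence])
```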
Cross Validation & Some Metrics
Cross validation is just a machine-learning term for evaluating the performance of a classifier, done by methodically splitting a dataset up into portions to ensure that good (or bad) results aren’t a fluke of the particular group of data used to train the model.
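As a concrete sketch (using scikit-learn and placeholder data, not the RASR dataset), stratified k-fold splits are a natural fit here because they keep the handful of positive sweeps spread evenly across folds:

```python
# Minimal sketch of k-fold cross validation on a small, skewed dataset:
# stratified folds keep the rare positive (re-entry) samples spread across
# every split. X and y are placeholders for preprocessed sweeps and labels.
import numpy as np
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64 * 64))      # stand-in features (flattened sweeps)
y = np.zeros(200, dtype=int)
y[:20] = 1                               # ~10% positives; far fewer in reality

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, val_idx) in enumerate(skf.split(X, y)):
    # train on X[train_idx], evaluate on X[val_idx] ...
    print(f"fold {fold}: {y[val_idx].sum()} positives of {len(val_idx)} samples")
```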
As I mentioned, there are six parameters recorded by NEXRAD systems, and only one of them was being used by RASR. The velocity field had been used in Fries’ report, but I wondered how the model would be affected if some combination of the parameters was used. After I had some cross validation pipelines set up, retooled the RASR scripts to work with different types of data, and created new datasets of training and validation data for the model, I found that incorporating Spectrum Width, Reflectivity, and correlation coefficient (a metric of the aspect ratio of the particles scanned) alongside the existing radial velocity resulted in a model that gave much better results with almost no architecture change. As a plus, I quadrupled the size of my dataset with very little effort.


To the left, the confusion matrix of RASR running on only velocity data. To the right, the confusion matrix of RASR running a combined 4-parameter dataset. It doesn’t get any worse at true positives, and it gets much better at true negatives. A “fall” is a re-entry.
Precision recall (PR) curves are an informative metric for the classification skill of models, particularly those dealing with highly skewed datasets like mine (skewed meaning waaay more negatives, as in no re-entries, than positives). PR is the relationship between sensitivity (recall) and positive predictive value (precision), defined by the equations below.


Basically, if a model’s recall is 1.0, it is correctly capturing every actual positive instance in the data. If a model’s precision is 1.0, everything it reports as positive actually is positive.

Here’s a precision-recall curve; it shows that even though the model is pretty good at finding all the re-entries in the data, it captures roughly 20 times as many false positives. This is one of the reasons why my pre-labeling idea in the automation section was so tough.
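For reference, a curve like this can be generated by sweeping a threshold over the model’s confidence scores – a sketch with synthetic labels and scores below, not the actual RASR evaluation code:

```python
# Sketch: build a precision-recall curve from held-out labels and model
# confidence scores. y_true / y_score here are synthetic placeholders with a
# heavy skew toward negatives.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import precision_recall_curve, average_precision_score

rng = np.random.default_rng(0)
y_true = (rng.random(500) < 0.05).astype(int)              # ~5% positives
y_score = np.clip(0.35 * y_true + 0.7 * rng.random(500), 0, 1)

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
ap = average_precision_score(y_true, y_score)

plt.plot(recall, precision)
plt.xlabel("recall")
plt.ylabel("precision")
plt.title(f"average precision = {ap:.2f}")
plt.show()
```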
Model Improvement
The primary goal of my project was to utilize the full spatial dimensionality of the NEXRAD data to detect re-entry phenomena. Because the radar sweeps out a 3D volume over time, I needed a model that could classify a time series of images. The model, a convolutional recurrent neural network (CRNN), combines the feature extraction mechanisms of convolutional neural networks (CNNs) that make them useful for image classification (as currently used in RASR) with the ability of recurrent neural networks (RNNs) to incorporate information from previous inputs.
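Here is a minimal sketch of the CNN-plus-RNN idea (an illustrative architecture, not the exact RASR model): a small convolutional encoder is applied to every frame of the sweep time series with shared weights, and an LSTM consumes the per-frame features to emit a single re-entry/no re-entry score for the whole sequence.

```python
# Illustrative CRNN: a shared CNN encoder per frame, an LSTM over the frame
# features, and a single binary logit for the whole sequence.
import torch
import torch.nn as nn

class CRNNClassifier(nn.Module):
    def __init__(self, in_channels=1, feat_dim=128, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(32 * 4 * 4, feat_dim), nn.ReLU(),
        )
        self.rnn = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)           # binary logit

    def forward(self, x):                          # x: (batch, time, C, H, W)
        b, t, c, h, w = x.shape
        feats = self.encoder(x.reshape(b * t, c, h, w)).reshape(b, t, -1)
        _, (h_n, _) = self.rnn(feats)              # final hidden state
        return self.head(h_n[-1]).squeeze(-1)      # (batch,) logits

# One fake batch: 2 sequences of 5 elevation slices of 64x64 radial velocity.
logits = CRNNClassifier()(torch.zeros(2, 5, 1, 64, 64))
print(logits.shape)                                # torch.Size([2])
```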
Before I could train the CRNN, I needed to create a new dataset of radial velocity data in 3D that would be passed into the network [5].

Here is a sample of the 3D time series of radar data; the re-entry is the larger blob towards the bottom. It appears 2D because every frame is a lower-altitude slice in the VCP.
After some testing and initial training on the supercomputer, I found that my CRNN was producing good results, with precision much higher than the original CNN’s, but I did not have enough training data to perform cross validation, so this could very well be a fluke. I proposed that physics constraints be incorporated into future iterations of the model: a re-entering object falls through the sky on a deterministic enough trajectory that the model could check candidate detections against it to ensure it was correctly finding re-entries.

My architecture at the close of the project. Pink is for proposed features.
Where Debris is Headed
Though I wish I could say that I made huge strides forward for the RASR tool, it was more like picking a direction towards which to work on steady improvements. It was extremely rewarding work, however, and Dr. Jah is without a doubt the academic I admire the most.
After working on this tool, I found myself increasingly interested in the potential for systems that work to mitigate the formation of orbital debris in the first place. This can take the form of the heavy-duty number crunching done by companies offering Space Situational Awareness Software as a Service (SSASAAS lol) like Slingshot or Privateer, or it can be efforts to develop space tugs or sophisticated orbital robotic platforms for satellite servicing [3]. Currently, debris is officially tracked by the U.S. Space Command’s 18th Space Defense Squadron. It consists of about 100 people tracking a catalog of objects that has doubled to ~30,000 in the past six years. For a bit of anchoring on the issue, the International Space Station (ISS) is a $150bn manned outpost in space that has had to maneuver to avoid such debris several times in the past year, mostly due to a recent Russian anti-satellite test [4]. If a large chunk of debris were to slam into the ISS and the result weren’t completely catastrophic, with loss of life, at the very least one of the 16 modules would need to be replaced. At ~$4k/kg, that’s roughly $250mm just to launch – not exactly chump change, even on NASA’s dime.
The lock-step proliferation of debris and commercial interest in space has created a market for mitigating the risk of collisions ruining extremely expensive equipment. Commercial operators and researchers in the public domain are stepping up where those 100 Space Command workers simply cannot, and it’ll be tough to lose money preventing collisions: upper-bound estimates for satellites in orbit by 2030 are on the order of 100,000.
Data is Messy
As a final thought, I wanted to highlight the immensity of the data to be gathered and the titanic statistical challenge that is hacking through it. Data on space objects is collected across the globe using large radar facilities of varying resolutions at frequencies on the order of GHz (usually X-band). There are lots of difficulties in getting accurate information on space object position and velocity, including atmospheric interference, the speed of the objects distorting the reading (inverse synthetic aperture effects), and distortion due to rotation of the object. Due to incomplete information, keeping track of debris in space is a wicked problem.
To predict collisions (or conjunctions, as they are called in industry), 30,000 objects with position and velocity vectors must be assessed on a sliding time window that can range from minutes to 10 days. Different mathematical methods for conjunction assessment are used, the simplest and most popular being the “two-dimensional” probability of collision (Pc) computation put forward in the 90s, where the closest distance between two objects is found from an assumed Gaussian distribution of the position vectors and the objects are propagated numerically forward in time on Keplerian orbits. More modern approaches, like Coppola’s method, are superior in that they account for uncertainties in velocity over time as well.
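To make the Pc idea concrete, here’s a toy Monte Carlo stand-in (not the 2D-Pc or Coppola formulations, and all the numbers are invented): sample the relative position at closest approach from its combined Gaussian covariance and count how often the miss distance falls inside the combined hard-body radius.

```python
# Toy Monte Carlo collision probability: given a nominal miss vector at
# closest approach, a combined Gaussian position covariance, and a combined
# hard-body radius, estimate how often sampled geometries actually collide.
import numpy as np

rng = np.random.default_rng(1)

rel_pos = np.array([50.0, -30.0, 20.0])          # nominal miss vector, meters
cov = np.diag([80.0, 150.0, 60.0]) ** 2          # combined position covariance, m^2
hard_body_radius = 20.0                          # combined object radius, m

samples = rng.multivariate_normal(rel_pos, cov, size=1_000_000)
miss_distances = np.linalg.norm(samples, axis=1)
pc = np.mean(miss_distances < hard_body_radius)
print(f"estimated collision probability: {pc:.2e}")
```

A real screening pipeline wraps a computation like this (or a closed-form equivalent) around the propagation step, for every close-approach pair, on every screening window.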
Just to drive it home, if I have 30,000 objects that I want to propagate forward in time just 2 days, that is on the order of 10 billion equations to be solved. Then I must account for errors and propagate uncertainties, which even under simple frameworks is 20x more. Finally, there is the issue of rooting through the results as fast as possible to sort out the potential collisions – about 50 exaFLOPs (50×10^18 floating-point operations) worth of computation. Definitely doable, but absolutely no fun if you’re constantly dealing with edge cases and incomplete data. And certainly not when billions of dollars are on the line.
Augmentation
[1] Of course, no one really agrees on this number, but it’s certainly not off by more than 10%
[2] For more detail, here is a great write up by Casey Handmer on the subject of space debris
[3] A satellite that is broken is often “non-cooperative” and cannot perform orbital maneuvers that would allow it to miss collisions with debris or even other satellites, in the worst case.
[4] The Russians blew up an old satellite of theirs with a Nudol missile in 2021, sparking international outrage; an international agreement to ban such tests is currently in progress. The U.S., India, & China are the other countries with these tests under their belts.
[5] I only used radial velocity data as opposed to combined data because I was running out of time and that would have 4xed the effort. Alone, it gives decent results in 2D.