Embedding the Sky

February 27, 2026

Two projects at the intersection of geospatial data and AI. First, a semantic search engine for aerial imagery, fine-tuned on OpenStreetMap-tagged data. Second, WARP—a custom neural model that learns embeddings of aircraft flight trajectories, trained on roughly a million aircraft tracks. Given any flight path, WARP can instantly retrieve the most similar trajectories from history. It enables applications from OSINT and anomaly detection to safety analysis and flight planning. The same latent structure that made word2vec transformative for language now becomes available for aircraft behavior.

Presented at Spec.LA on December 18, 2025.

	I'm going to quickly show two things tonight. For the past few years I've been working mostly in geospatial, aviation, and AI, and that's what these are. One is geospatial semantic search and the other is a foundation model for aircraft. I believe they're both extremely valuable and useful.
	Both projects are about embeddings. Embeddings are points inside a latent or conceptual space, and things that are conceptually similar sit closer together in that space. There are neural models that will generate an embedding of an image or text or other things. With the right model, an embedding of a photo of a golden retriever is close to the embedding of the text "golden retriever" and far from the embedding of an image of an aircraft boneyard.
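As a minimal sketch of what "close" can mean in an embedding space, the snippet below compares vectors with cosine similarity. The three-dimensional vectors are made up for illustration (real models typically emit hundreds of dimensions), and cosine similarity is an assumed choice of metric, not one the talk names.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; values near 1.0 mean 'pointing the same way'."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, invented for this example only.
photo_of_retriever = [0.9, 0.1, 0.0]
text_golden_retriever = [0.8, 0.2, 0.1]
photo_of_boneyard = [0.0, 0.2, 0.9]

# The photo and the matching text should score higher than the unrelated image.
print(cosine_similarity(photo_of_retriever, text_golden_retriever))
print(cosine_similarity(photo_of_retriever, photo_of_boneyard))
```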
	I showed an early version of PIMINTO at Spec a long time ago. I've refined it a lot since then. It uses a model that's fine-tuned on satellite imagery and text describing the imagery based on OpenStreetMap tags. I'm going to give a live demo of the new site, which is public. piminto.obliscence.com | Demo video
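To make "text describing the imagery based on OpenStreetMap tags" concrete, here is one way such a caption could be assembled from a tile's tags. This is only an illustration: the function name `caption_from_tags`, the caption wording, and the rule for `yes` values are all assumptions, not PIMINTO's actual pipeline.

```python
def caption_from_tags(tags):
    """Build a short caption from OSM key/value tags (hypothetical format)."""
    phrases = []
    for key, value in sorted(tags.items()):
        if value == "yes":
            # Tags like building=yes carry their meaning in the key.
            phrases.append(key.replace("_", " "))
        else:
            # Tags like leisure=golf_course carry their meaning in the value.
            phrases.append(value.replace("_", " "))
    return "satellite image of " + ", ".join(phrases)

# Example tags for one image tile (the tag names are real OSM tags, the tile is invented).
tile_tags = {"leisure": "golf_course", "natural": "water", "building": "yes"}
print(caption_from_tags(tile_tags))
```

Image and caption pairs of this kind are the usual input for fine-tuning a joint image-text embedding model, which is what lets a free-text query land near matching imagery.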
	Next is WARP, which is a model I'm building to create embeddings of aircraft paths. Back to aircraft, big surprise.
	I trained the model on about a million aircraft tracks, pretty much everything that flew in the western U.S. over the course of a week.
	At this point training takes 18 hours on a 4090 GPU.
	I've come up with a custom architecture for this task.
	I kept feeding the single file of model code into o3 or GPT-5, either asking it to critique the code or describing specific problems I was running into. Sometimes I'd paste a CSV of the loss curve. It would give me detailed descriptions of what to do next, and I'd paste those into Cline with Sonnet 4.5 doing the actual coding.
	How does it do?
	I hoped that if I took a few thousand trajectories and clustered their embeddings I would see clear separation and similarities. But that didn't really happen. A few extreme examples looked good, like skydiving aircraft being outliers, but otherwise I couldn't really make sense of it. But then I read someone saying that sometimes clustering just looks nonsensical even if the embeddings are good at capturing similarity.
	So I wrote an interface to let me compare any of the million trajectories. I can select one trajectory and it will almost instantly show me the most similar other trajectories in the database. In this screenshot the blue is the query trajectory and the red is the most similar trajectory it could find. This one isn't too bad. The altitude and speed profile are very similar but not exactly the same. Promising.
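The lookup described above is a nearest-neighbor search over embeddings. The talk doesn't say how WARP's search is implemented, so the following is just a brute-force sketch under assumed conditions: cosine similarity as the metric, a plain dict as the "database", and invented track ids and vectors.

```python
import heapq
import math

def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def most_similar(query_id, embeddings, k=1):
    """Return the ids of the k tracks most similar to the query track."""
    q = normalize(embeddings[query_id])
    scored = (
        (sum(a * b for a, b in zip(q, normalize(vec))), track_id)
        for track_id, vec in embeddings.items()
        if track_id != query_id  # never return the query itself
    )
    return [track_id for _, track_id in heapq.nlargest(k, scored)]

# Hypothetical track embeddings, invented for this example.
tracks = {
    "touch_and_go_a": [0.9, 0.1, 0.1],
    "touch_and_go_b": [0.8, 0.2, 0.1],
    "skydive_run": [0.1, 0.9, 0.2],
    "survey_grid": [0.0, 0.2, 0.9],
}
print(most_similar("touch_and_go_a", tracks, k=2))
```

An exact scan like this is linear in the number of tracks. At a million tracks, a vectorized scan or an approximate nearest-neighbor index would be a common way to keep lookups fast.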
	These are the signatures of touch and go practice. These two are a good match.
	These are two different skydiving aircraft. They climb up to altitude, everyone jumps out, then they race back to the field.
	These are two different flights by the same aircraft. But what does it really look like?
	This is pretty cool. This looks like two training flights, by the same aircraft but on different days. Each time following the same route and including kind of a loop and then some tight circles.
	These are two different Thales Watchkeeper drones, flying on different days over Fort Bliss. As a reminder, I chose the track on the left and then the system searched the database in a fraction of a second and determined that the track on the right was most similar to it.
	This starts to hint at some potential uses. Both of these helicopters are doing some sort of utility work in a forest. Hauling something maybe. Again, the track on the left is the query track, and the one on the right is the best match.
	These are two different aircraft on two different days, dropping sterile fruit flies on different parts of Los Angeles. There are many other aircraft flying these sorts of lawnmower patterns, for example doing LIDAR collection, but it determined the one on the right is the most similar.
	A useful aircraft embedding model unlocks a lot of interesting and valuable use cases. Word2vec and text embeddings unlocked a whole class of semantic search, clustering, retrieval, anomaly detection, and analogy-style reasoning over language. Now the same kind of latent structure becomes available for aircraft behavior, trajectories, and operational patterns.
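As one illustration of the anomaly detection use case, a track can be scored by how far it sits from its nearest neighbors in embedding space: a track with no close matches is unusual. The scoring rule below (one minus the mean cosine similarity to the k nearest other tracks) is an assumed, generic approach rather than anything WARP is stated to do, and the vectors are invented.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def anomaly_score(track_id, embeddings, k=2):
    """1 minus the mean similarity to the k nearest other tracks; higher means more unusual."""
    sims = sorted(
        (cosine(embeddings[track_id], vec)
         for other, vec in embeddings.items() if other != track_id),
        reverse=True,
    )
    top = sims[:k]
    return 1.0 - sum(top) / len(top)

# Hypothetical embeddings: three similar flights and one that matches nothing.
tracks = {
    "commuter_1": [0.9, 0.1, 0.0],
    "commuter_2": [0.8, 0.2, 0.0],
    "commuter_3": [0.85, 0.15, 0.05],
    "odd_one": [0.0, 0.1, 0.95],
}
scores = {track_id: anomaly_score(track_id, tracks) for track_id in tracks}
print(max(scores, key=scores.get))
```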

John Wiseman lemonodor.bsky.social