Difference between revisions of "Satellite image data concepts"

From Wonkpedia
(Sensor modalities (part 1))
(Rounding out discussion of optical)
   
 
The point to remember is that most satellite imagery with good spatial resolution is pansharpened, and this creates some artifacts. In particular, when you are zoomed all the way in to 100% (pixel-for-pixel screen resolution), you have actually overzoomed all the color or multispectral information. Any pansharpening algorithm can only estimate a likely distribution of color. It’s like superresolution with neural networks – it may be statistically likely to be correct, it may be perfect in some cases, it may help you interpret what’s there, but it is necessarily a process of inventing information. And that entails risks.
 
==== Georeferencing and orthorectification ====

''Much of this applies outside optical as well – move?''

A raw satellite image of land is an angled view of a rough surface. (Even nominally nadir-pointing satellites acquire imagery that is off-nadir toward its edges.) If you imagine riding on a satellite and looking off to, say, the west, you will see the eastern sides of hills and buildings at flatter angles than you see the western sides – if you can see them at all. To turn a raw image into something that is projected orthographically, like a map, you have to use a terrain model – a 3D map of the planet’s surface. Then you can use information about where the satellite was and the angle its sensor was pointing, and for each pixel in the output image, you can project it out to see at what latitude and longitude it must have intersected the ground. Then you move all the pixels to their coordinates in some convenient projection, and you’ve essentially taken the image out of perspective and made it orthographic.

Except:

* Earth’s surface is rough at every scale, and even “porous” or multiply defined in the sense that there are features like leafless trees that make it hard to define where the optical surface actually ''is'' at any given scale.
* There is no perfectly [https://en.wikipedia.org/wiki/Accuracy_and_precision accurate, precise], global, completely up-to-date terrain model of the Earth, let alone at a reasonable price. SRTM is pretty good, but its resolution is only about 30 m, it stops short of the Arctic, and it is 20+ years out of date: there are entire lakes, highway cuts, and reclaimed islands that don’t exist in it.
* Satellites typically only know where they’re pointing to within the equivalent of about 10 pixels (which, to be fair, is usually an extremely small fraction of a degree), so the pointing data can only narrow things down, not actually tell you where you are.
* Continental drift means that a continent can move by easily 1 px over the lifetime of a high-end commercial satellite; a major earthquake can discontinuously distort a small region by several meters.
* To properly pin down an image (i.e., to check the reported pointing angle), you need to know the exact 3D location of 3 visible points within it, and realistically more like 10.
* All these errors can combine.
* No matter what, you can’t recover occluded features, i.e. things you can’t see in the original data. If you want a high-res satellite image of something like a canyon, you realistically need half a dozen images at very specific angles, which is extremely hard.

We could go on! Georeferencing and orthorectification is a difficult problem. It’s easier for lower-resolution satellites, because a given angular error comes out to fewer pixels. Also, survey-mode satellites like Landsat and Sentinel-2, which are nadir-pointing anyway, put a lot of effort into doing this well. Two Landsat scenes will almost always coregister to well within a pixel. Sentinel-2 is a little less reliable, especially toward the poles. Commercial imagery is often displaced by far more than you would think. One way to see this is to step back in Google Earth Pro’s history tool, especially somewhere relatively remote and rugged.

Here’s a farm in Nepal: 28.553, 84.2415. Just step back in time and watch it jump around underneath the pin. If you really want to be scared, watch the cliff to its north. This is why imagery analysts who understand imagery pipelines rarely use a whole lot of significant digits in their coordinates! If all you have to go on is a satellite image, you don’t really know where anything on Earth is, in absolute terms, to better than a few meters at best.

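To get a feel for the sizes of the errors discussed in this section, here is a rough back-of-envelope sketch. It uses a flat-Earth, small-angle approximation, and every number in it (altitude, pixel size, pointing uncertainty, terrain-model error) is an illustrative assumption rather than the spec of any real satellite.

```python
import math

def pointing_error_m(range_m: float, angle_error_rad: float) -> float:
    """Ground displacement caused by a small pointing-angle error, near nadir."""
    return range_m * math.tan(angle_error_rad)

def relief_displacement_m(height_error_m: float, off_nadir_deg: float) -> float:
    """Horizontal shift caused by a terrain-height error when viewing off-nadir."""
    return height_error_m * math.tan(math.radians(off_nadir_deg))

altitude_m = 600e3   # assumed orbit altitude
angle_err = 5e-6     # assumed pointing uncertainty in radians (about 1 arcsecond)

shift = pointing_error_m(altitude_m, angle_err)
print(f"pointing error on the ground: {shift:.1f} m")
print(f"  at 0.5 m pixels: {shift / 0.5:.1f} px")
print(f"  at 30 m pixels:  {shift / 30.0:.2f} px")

# A 10 m error in the terrain model, seen from 25 degrees off-nadir (both assumed).
relief = relief_displacement_m(10.0, 25.0)
print(f"terrain-model error shows up as: {relief:.1f} m of horizontal shift")
```

The same angular error that costs several pixels at high resolution is a small fraction of a pixel at Landsat-like resolution, which is the sense in which lower-resolution data is easier to georeference.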
==== Atmospheric correction ====

Over long distances, even in clear weather, the atmosphere scatters and absorbs light. This is why distant hills are low-contrast and blueish (blue light is scattered more). What a satellite actually measures is called top-of-atmosphere radiance, or TOA. This is a measurement of nothing more than the amount of energy received per second, per pixel, per band. It can be measured pretty objectively. However, it’s often not what you want. For one thing, it’s too blue. For another, the amount of blueness and related effects will vary semi-randomly with atmospheric conditions (humidity, maybe dust storms or wildfire smoke, etc.) and predictably with season.

Therefore, a reasonable desire is to basically normalize the sun and remove the effects of the atmosphere. What we’re trying to model here is called surface reflectance (SR). The main issue is that we don’t know the true state of the atmosphere at the moment the image was acquired. The best we can do is to model it and subtract it out. This is one of ''the'' problems in remote sensing, and you could earn a PhD by improving [https://en.wikipedia.org/wiki/Atmospheric_radiative_transfer_codes#Table_of_models one of the major models] by a few percent.

The good news is there’s a brutally simple method that works pretty well most of the time. Dark object subtraction means assuming that the darkest pixel in the image should be pure black. Therefore, if you subtract out however much blue (and green, and so on) signal is present in the darkest pixel, you will have canceled out all the haze. It’s annoying how well this works considering how basic it is. It’s roughly equivalent to the auto-adjust tool in an image editor like Photoshop, or, to be a little more exact, a little like using the eyedropper in the Levels tool to set the black point to the darkest pixel.

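As a minimal sketch of dark object subtraction with numpy: this version takes the darkest value in each band separately, which is one common variant and an assumption here, not the only way to pick the dark object. The synthetic scene and haze offsets are made up for illustration.

```python
import numpy as np

def dark_object_subtraction(image: np.ndarray, percentile: float = 0.0) -> np.ndarray:
    """Subtract each band's darkest value from that band.

    image has shape (bands, rows, cols). percentile=0 uses the true minimum;
    a small positive value (say 0.1) is less sensitive to dead pixels.
    """
    dark = np.percentile(image, percentile, axis=(1, 2), keepdims=True)
    return np.clip(image - dark, 0, None)

# Synthetic scene: three bands (red, green, blue) with one truly black pixel.
rng = np.random.default_rng(0)
surface = rng.uniform(0.0, 0.5, size=(3, 64, 64))
surface[:, 0, 0] = 0.0

# Additive haze, strongest in blue, as described above.
haze = np.array([0.02, 0.05, 0.10]).reshape(3, 1, 1)
observed = surface + haze

corrected = dark_object_subtraction(observed)
print(np.abs(corrected - surface).max())  # residual error after correction
```

The method only recovers the surface this cleanly because the scene really does contain a black pixel and the haze is purely additive; real scenes violate both assumptions to some degree.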
Correction to reflectance may or may not attempt to correct for terrain effects (i.e., relighting the scene). Different pipelines have different conventions for how far to correct or what to call different kinds of correction.

Atmospheric correction is usually not key for OSINT purposes, but any time you find yourself taking exact measurements of pixel values, you should at least know whether you’re working in TOA or in SR, and if SR, you should have a sense of what the pipeline was.

==== Common optical sensor types ====

''This section is a stub. Please start it!''

# Pushbroom
# Whiskbroom
# Full-frame

=== Thermal ===
''This section is a stub. Please start it!''

Revision as of 21:55, 20 July 2022

This page provides an organized list of ideas useful for understanding image data from satellites. It is intended for people with some background or practical knowledge who want to fill in the gaps. Since many concepts are intrinsically cross-cutting, they can’t be forced into a single perfectly hierarchical taxonomy; the goal is merely to keep related ideas reasonably near each other.

We might divide up the kinds of knowledge it’s useful to have when working with satellite data like this:

{| class="wikitable"
|+ Layers of abstraction in remote sensing knowledge
! Practice !! This page !! Theory
|-
| Learning how to answer questions by actually using data in Photoshop, QGIS, numpy, etc. || Learning technical vocabulary and concepts that apply across sources || Learning rigorously defined principles based in physics, geostatistics, etc.
|}

All of these kinds of knowledge are important to an OSINT practitioner. This page only covers the middle range – ideas that are more abstract than what you can learn from the pixels themselves, but less abstract than what you would get in a higher-level college course.

Within those bounds, the organizational arc here is broadly from the more abstract (orbits) through the relatively concrete (how sensors work) to the practical (what a geotiff is).

Orbits and pointing

As an example of a typical optical Earth observation orbit, let’s take Landsat 9’s parameters from Wikipedia:

  • Regime: Sun-synchronous orbit. This means the orbit is designed to always pass overhead at about the same local solar time. Put another way, any two Landsat 9 images of a given spot at a given time of year will have the same angle of sunlight on the surface, and the same angle between the surface and the sensor. Specifically, it always crosses the equator on its southbound half-orbit at 10:00 (and, therefore, on its northbound half-orbit at 22:00). This mid-morning window is the sweet spot for most optical imaging purposes. In most climates where cumulus clouds are common, they generally form around midday as the mixed layer rises. It’s also claimed that this is the heritage of Cold War IMINT workers wanting shadows to estimate structure heights. (If you image around noon, you get places with vertical shadows in the tropics. This gives you depth perception problems, like you get walking through brush with a headlamp instead of a hand-held flashlight. Citation needed, though.) Virtually all commercial satellite imagery that you see on commercial maps has shadows that point west and away from the equator – in fact, as of 2022, this is so consistent that if you see a shadow pointing a different direction, it’s a good hint that the imagery is actually aerial (taken from a plane/UAV/balloon inside the atmosphere), not satellite.
  • Altitude: 705 km (438 mi). This is basically chosen to be as close to the surface as reasonably possible without grazing the atmosphere enough to perturb the orbit. It is substantially higher than the International Space Station, for example, but ISS has to constantly boost itself back up and that’s expensive. (ISS does occasionally underfly imaging satellites.) For comparison, if Earth were the size of a 30 cm (12 inch) desktop globe, Landsat 9’s orbit would be at 17 mm (2/3 inch) – grazing your knuckles if you held the globe like a basketball. (Developing some intuition about this relative size can help understand the practicalities of things like off-nadir imaging.)
  • Inclination: 98.2°. This is the angle at which the satellite crosses the equator. It makes the orbit slightly retrograde, which is part of the equation for staying sun-synchronous. A consequence is that although orbits like this one are sometimes called polar in a loose sense, they never exactly cross the pole – Landsat 9 always misses the south pole on its left and the north pole on its right. This leaves two relatively small polar gaps that are never imaged.
  • Period: 99.0 minutes. This is the time it takes to do one full orbit. This is another variable constrained by the requirements of sun-synchrony and the lowest reasonable altitude.
  • Repeat interval: 16 days. Every 16 days, Landsat 9 is in exactly the same spot relative to Earth (± very small deflections due to space weather, micrometeorites, tides, maneuvers to avoid debris, etc.) and takes an image that can be exactly co-registered with the previous cycle’s. Furthermore, pairs (or mini-constellations) like Landsat 8 and 9 or Sentinel-2A and 2B are in identical orbits but half-phased such that, from a data user’s perspective, they act like a single satellite with half the repeat time. (Specifically, 8 days for Landsat 8/9 and 5 days for Sentinel-2A/B.) More or less by definition, constellations are designed to fill in each other’s gaps; for example, the wide-swath, low-resolution MODIS instruments are on a pair of satellites with near-daily coverage, but one mid-morning and the other mid-afternoon.
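Several of the numbers in the list above can be cross-checked with a few lines of arithmetic. This is a minimal sketch that assumes a perfectly circular two-body orbit (no drag, no oblateness), so expect agreement to within a fraction of a percent rather than exactly. The constants are standard values for Earth; the function name is just illustrative.

```python
import math

MU_EARTH = 398_600.4418  # km^3/s^2, Earth's gravitational parameter
R_EARTH = 6_378.137      # km, equatorial radius

def circular_period_min(altitude_km: float) -> float:
    """Orbital period in minutes for a circular orbit (Kepler's third law)."""
    a = R_EARTH + altitude_km  # semi-major axis of a circular orbit
    return 2 * math.pi * math.sqrt(a**3 / MU_EARTH) / 60

period = circular_period_min(705.0)
orbits_per_cycle = 16 * 24 * 60 / period   # orbits in one 16-day repeat cycle
globe_height_mm = 705.0 / R_EARTH * 150.0  # a 30 cm globe has a 150 mm radius

print(f"period: {period:.1f} min")
print(f"orbits per 16-day cycle: {orbits_per_cycle:.1f}")
print(f"orbit height on a 30 cm globe: {globe_height_mm:.1f} mm")
```

The period lands close to the quoted 99.0 minutes, the repeat cycle comes out to a near-integer number of orbits (which is what makes it a repeat cycle), and the desktop-globe height comes out near the 17 mm quoted above.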

We used Landsat 9 here because it’s familiar to most people in the industry and is well documented. Other imaging satellites will have different sets of capabilities and constraints. For example, the Landsat series is on-nadir (looking straight down) more than 99% of the time. It only rolls to the side to look away from its ground track for exceptional events, e.g., major volcanic eruptions. But a high-res commercial satellite, e.g., in the Airbus Pléiades or Maxar WorldView constellations, is constantly looking off-nadir. One of these satellites might point its optics in easily half a dozen directions on a given orbit, and would only very rarely happen to look straight down.

Commercial users typically want images that are on-nadir and settle for images less than about 30° off-nadir. Around that angle, atmospheric and terrain correction starts getting hard, tall things are seen from the side as well as from above and block whatever’s behind them (an effect called layover), and the practical utility of imagery falls off for most purposes. But the area within 30° of nadir is quite large: for a satellite at around 700 km altitude, it extends about 400 km or 250 mi to either side of the ground track, according to some light trig.
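The light trig can be sketched as follows, assuming a spherical Earth and using a 705 km altitude as a stand-in (commercial satellites fly at various altitudes). The function returns the distance along the ground from the point directly under the satellite to the point it sees at a given off-nadir angle.

```python
import math

def ground_distance_km(altitude_km: float, off_nadir_deg: float,
                       r_earth_km: float = 6371.0) -> float:
    """Ground distance from the sub-satellite point to the viewed point."""
    eta = math.radians(off_nadir_deg)
    # Law of sines in the triangle Earth's centre / satellite / ground point
    # gives the incidence angle at the ground.
    incidence = math.asin((r_earth_km + altitude_km) / r_earth_km * math.sin(eta))
    # The angle at Earth's centre is what is left over.
    central_angle = incidence - eta
    return r_earth_km * central_angle

curved = ground_distance_km(705.0, 30.0)
flat = 705.0 * math.tan(math.radians(30.0))  # flat-Earth shortcut for comparison
print(f"spherical Earth: {curved:.0f} km, flat-Earth shortcut: {flat:.0f} km")
```

Both versions give a reach of roughly 400 km from the sub-satellite point at 30° off-nadir; Earth's curvature only adds a couple of percent at this angle.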

High-resolution commercial satellites schedule collections in a process called tasking (as in “Tokyo is tasked for tomorrow”). This is in contrast to the survey mode collection used by Landsat, Sentinel, etc., which are essentially always collecting when they’re over land.

Resolutions

Satellite instruments can be thought of as identifying features (a deliberately abstract term) in any of a number of dimensions. The dimension(s) we think of most often is spatial: x and y, or equivalently longitude and latitude or east and north, on Earth’s surface. But a sensor needs a nonzero amount of resolving power in the other dimensions as well in order to be useful.

The idea of resolving power has formal definitions in optics, for example, but here we will be informal and common-sensical about what it means to actually resolve something. In particular, resolution is usually defined in terms of points (in some dimension), but in the real world we only rarely care about points of any kind; we’re usually more interested in objects and patterns.

As an example, imagine we’re looking for a bright white napkin left on a freshly paved asphalt runway. Even if our data is at a resolution of, say, 25 cm, and the napkin is only 10 cm across, we will probably be able to find the napkin because the pixels it overlaps will be noticeably brighter, assuming good radiometric resolution. In this case, we’ve beaten the nominal spatial resolution of the sensor – we haven’t technically resolved the napkin, but we’ve found it, which is what we wanted.

On the other hand, imagine that there are F-16s on the runway, and we want to know whether they’re F-16As or F-16Cs. Unless we have outside information (about markings, say), it’s entirely possible that we can’t tell. The details we need simply aren’t clearly visible from above. Therefore, we cannot determine whether there are F-16As at this airfield – despite the fact that F-16As are much larger than the resolution of the sensor. This seems painfully obvious when spelled out, but people who should know better routinely make versions of this mistake when working on real questions.

These two examples with spatial resolution illustrate that you can’t think of resolution (of any kind) as simply the ability to see a thing of a given size. Sometimes you’ll have better data than you’d think from looking at the number alone and sometimes you’ll have worse. Be skeptical of blanket statements that you definitely can or can’t see x at resolution y. Often, it’s really a situation where you can see some % of xs at resolution y under conditions z, and it’s just a question of whether trying is worth the time.

Resolutions are in a multi-way tradeoff in sensor design. As one of several important factors, increasing each kind of resolution multiplies data volumes, and getting data from a satellite to the ground is expensive and sometimes physically limited. In a sense, you can’t get satellite data that does everything (is super sharp and hyperspectral and …) for the same reason you can’t get a blender that’s also a toaster and a dishwasher. The laws of physics might not preclude it, but the constraints of sensible engineering absolutely do. What you see in practice are satellites that push for some kinds of resolution at the expense of others. Knowing how to mix and match to answer a particular question is a valuable skill.

Spatial

If someone says “this is a high-resolution sensor” we understand this by default to mean spatial resolution. This is also called ground sample distance (GSD) or ground resolved distance (GRD), and is the dimensions of the pixels of the data. (Theoretically, you could oversample your data and have pixels smaller than what’s actually resolvable, but that’s not an urgent consideration here.) We usually assume that the pixels are square or close enough, so you see this given as a single length dimension: 50 cm, 15 m, etc.

There’s some sleight of hand with definitions here. If we think about standard optical instruments, which are basically telescopes with CCDs, they do not have an intrinsic ground sample distance. They have an intrinsic angular resolution – a fraction of the arc that each pixel covers. This only becomes a distance on Earth’s surface if we assume the sensor is pointed at Earth at a given distance and angle. The nominal resolutions of optical satellite instruments are given for the altitude of the satellite (which can change) and looking on nadir (straight down). That’s a best case. When looking to the side, at rough terrain, the pixels can cover larger areas, inconsistent areas from one part of the image to another, and areas that are not square. Some of these problems get better and others get worse after orthorectification (see below).
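A rough sketch of how angular resolution turns into ground sample distance, and how it degrades off-nadir. This assumes flat ground and a flat Earth, which is only reasonable for modest angles, and the angular resolution and altitude are made-up round numbers.

```python
import math

def pixel_footprint_m(ifov_rad: float, altitude_m: float,
                      off_nadir_deg: float = 0.0) -> tuple[float, float]:
    """Approximate ground footprint of one pixel on flat terrain.

    Returns (across, along): the pixel's size perpendicular to the look
    direction and its size measured along the look direction.
    """
    theta = math.radians(off_nadir_deg)
    slant_range = altitude_m / math.cos(theta)  # farther away when looking sideways
    across = ifov_rad * slant_range
    along = across / math.cos(theta)            # stretched by the oblique view
    return across, along

ifov = 1e-6       # assumed angular resolution per pixel, in radians
altitude = 500e3  # assumed altitude in meters

for angle in (0, 15, 30, 45):
    across, along = pixel_footprint_m(ifov, altitude, angle)
    print(f"{angle:2d} deg off-nadir: {across:.2f} m x {along:.2f} m")
```

The nominal figure only holds at 0°. By 30° the pixel is both larger and no longer square, even before terrain is considered.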

This is why it pays to be very cautious about measuring things based purely on pixel-counting, especially in imagery that’s been through some proprietary or undocumented processing pipeline. It’s more reliable to (1) have a very clear sense of what scale distortions are likely present in the image, and (2) reference measurements to objects of safely assumed dimensions.

An old-school IMINT way to measure what spatial resolution means in practice is the National Imagery Interpretability Rating Scale (NIIRS).

An often overlooked consideration on spatial resolution is that pixel area is the square of pixel side length, and it’s what matters most. (We’ll assume square pixels for this discussion.) If you consider a square meter of ground, you can envision it covered by exactly 1 pixel at 1 m GSD. At “twice” that GSD, 50 cm, it’s covered by 4 pixels – but 4 is not twice 1. At 25 cm GSD, which sounds like 4× the resolution, it’s covered by 16 pixels, which is far more than 4× as clear. Perceived sharpness, information in a technical sense, and (most importantly) the practical ability to interpret fine details goes up in proportion to pixel count, not as the inverse of pixel edge length. In other words, 10 m imagery is more than 3× as clear as 30 m imagery, all else being equal.

Spectral

Spectral resolution is the ability to distinguish different frequencies (wavelengths) of light or other energy. We often measure it as a number of bands, where bands are like the R, G, and B channels in everyday color imagery. Grayscale imagery has 1 band. RGB imagery has 3. RGB + near infrared (a common combination) has 4. Multispectral sensors on more advanced satellites often have about half a dozen to a dozen bands, typically covering the visible range and then parts of the near to moderate infrared spectrum.

We often measure into the infrared (IR) for three main reasons:

  1. The atmosphere scatters infrared light less than visible light, and much less than blue light in particular. This allows for more clarity and contrast – basically, better radiometric resolution (see below). Another way of saying this is that IR light cuts through haze.
  2. Healthy plants strongly reflect near infrared (NIR) light. If we could see only slightly deeper shades of red, we’d see trees and grass glowing hot pink. This means infrared is useful for vegetation monitoring (for example, with NDVI), which is useful for agriculture but also for anything that affects plants. You can use infrared to spot subtle tracks and traces on vegetation that might be invisible in ordinary imagery. (For example, you might be able to detect a road under a forest canopy by noting that a line of trees is thriving slightly less than last year.)
  3. Things that are camouflaged in visible light, deliberately or not, are often easily distinguishable in infrared. Specifically, green paint tends to absorb IR (unlike plants) and stand out like a sore thumb. Since everyone knows this now, sophisticated actors no longer assume that you can hide a tank (for example) by painting it green, but you can still find things in infrared that you wouldn’t have in visible. You see more stuff when you have more frequencies available.

For these reasons, and others as well, optical satellites have always been biased toward the IR side of the spectrum.

Many optical sensors have one spatially sharp band with low spectral resolution, typically covering the visible range and some infrared, and multiple bands that are spectrally sharp but spatially coarse. These are called the panchromatic (pan) band and, collectively, the multispectral bands. They are merged for visualization in a process called pansharpening (see below). Not every sensor follows this pattern: Sentinel-2 does not have a pan band, but it collects different bands at different spatial resolutions roughly in proportion to their assumed importance – visible and NIR are 10 m, some other IR bands are 20 m, and then there are some “bonus” atmospheric bands at only 60 m.

Sensors that focus specifically on spectral resolution (sometimes with hundreds of bands) are called hyperspectral.

Here we’ve used optical and infrared wavelengths as examples, but the basic principles are similar for, e.g., radio frequency bands. In general, for any kind of observation, multiple spectral bands help resolve ambiguities in the scene and open up useful avenues for inter-band comparison.

Temporal

Temporal resolution is resolution in time. This is also called revisit time or cadence. As mentioned above, temporal resolution for medium-resolution open data survey-style satellites (Landsat 8 and 9, Sentinel-2A and 2B, Sentinel-1A, and others) is typically around two weeks per satellite or one week per constellation. For weather satellites (with very low spatial resolution) it can be as quick as 30 seconds in certain cases. PlanetScope and many low spatial resolution science satellites are approximately daily.

High-res commercial satellite constellations are a special case, because, as we’ve seen, their collections are based on tasking. This means that if there’s some point that they never have a reason to collect, their actual revisit time might be infinite. If there’s a major geopolitical crisis and every possible image is taken, even from extreme angles, it might be more often than once a day. Realistically, over moderately populated areas of no special interest, it might be once or twice a year; in deserts, it might be multiple years.

Radiometric

Radiometric resolution is often overlooked, but it’s especially interesting to OSINT. It’s essentially bit depth: the number of levels of light (or other energy) that the sensor can distinguish in a given band. Older or cheaper satellites might have a radiometric resolution of 8 or 10 bits; newer and better ones are typically 12 to 14.

High bit depth opens up many possibilities – for example:

  • You can stretch contrast to account for obscurations like haze, thin clouds, and smoke.
  • You can stretch contrast to find extremely faint traces on near-homogeneous backgrounds: wakes on water surfaces, paths on snowfields, offroading by light vehicles. Initial testing suggests Landsat 9 OLI (which has excellent radiometric resolution) can pick up the tracks of single trucks on the Sahara, despite the tracks being made out of sand on sand and much smaller than a single pixel of spatial resolution. It can also pick up bright city lights at night.
  • Band math, such as calculating band ratios or distances in spectral angle, gets more stable and accurate.
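As a minimal sketch of the contrast stretching mentioned in the list above, here is a percentile stretch in numpy. The synthetic scene (a flat 12-bit-style background with a faint one-pixel-wide track) and the 2%/98% cut-offs are assumptions chosen for illustration.

```python
import numpy as np

def percentile_stretch(band: np.ndarray, low: float = 2.0, high: float = 98.0) -> np.ndarray:
    """Linearly rescale a band so the chosen percentiles map to 0 and 1."""
    lo, hi = np.percentile(band, [low, high])
    if hi <= lo:
        return np.zeros_like(band, dtype=float)  # flat image: nothing to stretch
    return np.clip((band - lo) / (hi - lo), 0.0, 1.0)

# Near-homogeneous background around 2000 counts with a little sensor noise,
# plus a faint linear trace only 6 counts brighter than its surroundings.
rng = np.random.default_rng(1)
scene = rng.normal(2000.0, 3.0, size=(100, 100))
scene[50, :] += 6.0

stretched = percentile_stretch(scene)
print(f"track row mean:      {stretched[50].mean():.2f}")
print(f"background row mean: {stretched[10].mean():.2f}")
```

A difference of 6 counts out of roughly 4000 is invisible at default display settings but becomes a clear line once the stretch spends the whole display range on the narrow band of values actually present.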

In OSINT we usually can’t afford a lot of highest spatial resolution imagery. However, the excellent radiometric resolution of a lot of free data (since it was designed for science) gives us a side route into seeing things that someone hoped would not be noticed.

Radiometric resolution can be increased at the cost of spectral resolution by averaging bands. Under idealizing assumptions, the standard deviation of the noise of an image average is 1/sqrt(n), where n is the number of input images with unit standard deviation noise. (In practice, noise will be positively correlated between the bands of most sensors, so you’ll fall at least somewhat short.)
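The 1/sqrt(n) figure, and the shortfall when noise is correlated between bands, can be checked with a small simulation. This sketch uses pure synthetic Gaussian noise; the correlation value of 0.3 is an arbitrary assumption, not a measurement of any sensor.

```python
import numpy as np

rng = np.random.default_rng(42)
n_bands, shape = 4, (512, 512)

# Case 1: independent unit-variance noise in every band.
independent = rng.normal(0.0, 1.0, size=(n_bands, *shape))
std_independent = independent.mean(axis=0).std()

# Case 2: each band mixes a shared noise field with its own noise, so that
# every band still has unit variance and any two bands have correlation rho.
rho = 0.3
shared = rng.normal(0.0, 1.0, size=shape)
own = rng.normal(0.0, 1.0, size=(n_bands, *shape))
correlated = np.sqrt(rho) * shared + np.sqrt(1 - rho) * own
std_correlated = correlated.mean(axis=0).std()

ideal = 1 / np.sqrt(n_bands)
predicted_correlated = np.sqrt(rho + (1 - rho) / n_bands)

print(f"independent: {std_independent:.3f} (ideal {ideal:.3f})")
print(f"correlated:  {std_correlated:.3f} (predicted {predicted_correlated:.3f})")
```

With correlated noise the shared component never averages out, which is why the improvement stalls no matter how many bands are combined.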

Another way to look at radiometric resolution is to think about the total signal to noise ratio, or SNR, of the image. Some of the noise is what we usually mean by noise – semi-random grainy or streaky false signals inserted into the image by sensor flaws, cosmic rays, and so on. But some of it will be quantization noise, a.k.a. rounding errors or aliasing: output imprecision due to the inability to represent all possible values of real data. This latter kind of noise is the problem that increases as bit depth goes down. (This is analogous to the idea of talking about effective spatial resolution as a combination of the sampling resolution and the point spread function being sampled. But we’re getting off the main track here.)

Modalities

A sensor’s modality is the form of energy it senses and the general principles it uses to construct useful data. For example, microphones are sensors whose modality is measuring air pressure to record sound, barometers are sensors whose modality is using air pressure to record weather-scale atmospheric events, and everyday cameras are sensors whose modality is measuring visible light to record focused images.

Optical

Here we’ll define the optical domain as anything transmitted by Earth’s atmosphere in the windows between about 300 nm and 3 μm. This includes near ultraviolet (here, “near” means “near visible”, not “almost”), visible, near infrared, and shortwave infrared light, but not thermal infrared. You might also see this range described as, for example, VNIR + SWIR – visible, near infrared, and shortwave infrared. We’ll use Landsat as an example again, since its OLI sensor (on Landsat 8 and 9) is well-known and fairly typical of rich multispectral sensors. Its bands are:

{| class="wikitable"
|+ OLI and OLI2 bands[1]
! Name !! Wavelength range in nm (FWHM) !! Primary uses !! Visible to human eyes
|-
| Coastal/aerosol || 435 to 451 || Deep blue-violet. Water is very transparent in this band, so it can see into shallows. Also picks up Rayleigh scattering from aerosols, helping model atmospheric effects and distinguish clouds v. dust v. smoke. || Yes
|-
| Blue || 452 to 512 || For true color. Useful for water. Better SNR than the coastal/aerosol band. || Yes
|-
| Green || 533 to 590 || For true color. Chlorophyll (land vegetation, plankton, etc.). Around the peak illumination of the sun. || Yes
|-
| Red || 636 to 673 || For true color. Absorbed well by chlorophyll. Shows soil. || Yes
|-
| NIR (near infrared) || 851 to 879 || Reflected extremely well by chlorophyll and healthy leaf structures. Often the brightest band. || No
|-
| SWIR1 (shortwave infrared 1) || 1,567 to 1,651 || Cuts through thin clouds well. Reflectivity correlates with dust/snow grain size – informative about surface texture. Note that this range in nm is 1.567 to 1.651 μm. || No
|-
| SWIR2 (shortwave infrared 2) || 2,107 to 2,294 || Similar to SWIR1; some surfaces are easily distinguished by their differences in SWIR1 v. SWIR2. Flame/embers and lava glow strongly here. || No
|-
| Pan (panchromatic) || 503 to 676 || Twice the linear resolution of all the other bands, since its wide bandwidth can integrate more photons at a given noise level. Used for pansharpening. This and the next are given out of spectral order. || Yes
|-
| Cirrus || 1,363 to 1,384 || Deliberately not in an atmospheric window – almost entirely absorbed by water vapor in the lower atmosphere, but strongly reflected by high clouds. Allows for better atmospheric correction by spotting thin clouds. || No
|}

Band names are semi-standard in the sense that, for example, green will always mean some version of visible green. However, exact bandpasses can vary quite a bit between sensors. Intercomparing bands from different sensors on the assumption that they must match will often lead to problems – check the actual numbers, not the names.

Bands can be processed and combined in many, many useful ways. For example, you can run statistics like principal component analysis on a set of bands to find correlations and outliers. You can use band ratios like NDVI, NDWI, or NBR, which index properties like vegetation health, surface moisture, and burn scars. You can treat multispectral values as vectors to be clustered, compared, or decomposed. You can derive a “contra-band” by subtracting some bands out of another band that covers them.
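As a small illustration of a band ratio, here is NDVI, defined as (NIR − red) / (NIR + red), in numpy. The helper name and the three sample pixels are made up for the example; other normalized-difference indexes follow the same pattern with different band pairs.

```python
import numpy as np

def normalized_difference(a, b) -> np.ndarray:
    """Compute (a - b) / (a + b) per pixel, returning NaN where a + b is zero."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    total = a + b
    out = np.full(total.shape, np.nan)
    np.divide(a - b, total, out=out, where=total != 0)
    return out

# Illustrative reflectances: dense vegetation, bare soil, dark water, no data.
nir = np.array([0.50, 0.30, 0.05, 0.0])
red = np.array([0.05, 0.25, 0.04, 0.0])

ndvi = normalized_difference(nir, red)
print(np.round(ndvi, 3))
```

Dividing by the sum is what makes the index a ratio: it cancels much of the overall brightness variation from illumination and terrain, so the value mostly reflects the relationship between the two bands.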

You almost always learn more by comparing bands than from one band alone. Features that are unremarkable in a single grayscale image can become meaningful if you notice that they don’t fit the usual relationship between that band and some other band(s).

True and false color

True color imagery puts red, green, and blue sensed bands in the red, green, and blue bands of the output image. It looks more or less like it would to an astronaut with binoculars. What’s called true color is often not quite, because the sensor bands don’t correspond exactly to the primaries used in standards like sRGB, but the difference is rarely important.

Humans have 30 million years of evolutionary hard-wiring and several decades of individual practice in interpreting true color images, and therefore you should favor true color whenever reasonably possible.

However, often false color is the way to go. This means putting anything but red, green, and blue bands (in that order) in the channels of the image you’re looking at. You might not even use bands directly at all; you might derive indexes or other more processed pseudo-bands. You could pull in data from another modality. Most often, however, people simply choose the bands that are most useful to them and put them in the visible channels in spectral order (i.e., the longest wavelength goes in the red channel and the shortest in blue). For any widely used sensor, a web search should give you a selection of “zoos” demonstrating popular band combinations – for example, here’s one for Landsat 8/9, but you can find dozens of others.

Band combinations are usually given by sensor-specific band numbers: 987 or 9-8-7 means band 9 is in the red channel and so on. (Annoyingly, this means that, e.g., Landsat 8/9 combination 543 and Sentinel-2 combination 843 are basically the same thing despite having different numbers.)

Pansharpening

Many sensors, including virtually all current-generation commercial data at about 1 m or sharper spatial resolution, have a spatially sharp but spectrally coarse panchromatic (pan) band and a set of spatially coarser but spectrally sharper multispectral bands. The nominal spatial resolution of the sensor will be for the pan band alone, and the multispectral bands’ pixels will be (typically) some multiple of 2 larger on an edge. For example, Landsat 8 and 9 have 15 m pan bands and 30 m multispectral bands (2×, linearly). The Pléiades and WorldView constellations have roughly 50 cm pan bands and 2 m multispectral bands (4×). SkySat, unusually, produces imagery (with some preprocessing) at 57 cm pan, 75 cm multispectral (~1.3×).

For visualization purposes, we combine panchromatic and visible data into a single image. As an intuitive model of this process, imagine overlaying a translucent, sharp black-and-white image (the pan band) onto a blurry color image (the RGB bands) of the same scene. You can actually do this quite literally and get a semi-acceptable result, or work harder to get a better result. “Real” automated pansharpening algorithms range from the very basic to the extremely sophisticated.
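One of the very basic approaches can be sketched in a few lines. This is a Brovey-style ratio method, chosen here only because it is short; it is not necessarily what any particular provider uses. The function names are made up for the example, and it assumes the pan band is exactly `factor` times larger than the color bands on each edge.

```python
def upsample_nearest(band, factor):
    """Repeat every pixel of a 2D list `factor` times in both directions."""
    out = []
    for row in band:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def brovey_pansharpen(red, green, blue, pan, factor):
    """Scale upsampled color so its mean brightness matches the sharp pan band."""
    r, g, b = (upsample_nearest(x, factor) for x in (red, green, blue))
    out = []
    for i, pan_row in enumerate(pan):
        out_row = []
        for j, p in enumerate(pan_row):
            intensity = (r[i][j] + g[i][j] + b[i][j]) / 3
            gain = p / intensity if intensity > 0 else 0.0
            out_row.append((r[i][j] * gain, g[i][j] * gain, b[i][j] * gain))
        out.append(out_row)
    return out


# One coarse color pixel covering a 2x2 block of sharp pan pixels.
sharp = brovey_pansharpen([[0.4]], [[0.2]], [[0.0]],
                          [[0.1, 0.2], [0.3, 0.4]], factor=2)
for row in sharp:
    print(row)
```

Note that all four output pixels end up with the same ratio between channels: the brightness detail comes from the pan band, but the color is just the one coarse value stretched over the block.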

The point to remember is that most satellite imagery with good spatial resolution is pansharpened, and this creates some artifacts. In particular, when you are zoomed all the way in to 100% (pixel-for-pixel screen resolution), you have actually overzoomed all the color or multispectral information. Any pansharpening algorithm can only estimate a likely distribution of color. It’s like superresolution with neural networks – it may be statistically likely to be correct, it may be perfect in some cases, it may help you interpret what’s there, but it is necessarily a process of inventing information. And that entails risks.

Georeferencing and orthorectification

Much of this applies outside optical as well – move?

A raw satellite image of land is an angled view of a rough surface. (Even nominally nadir-pointing satellites acquire imagery that is off-nadir toward its edges.) If you imagine riding on a satellite and looking off to, say, the west, you will see the eastern sides of hills and buildings at flatter angles than you see the western sides – if you can see them at all. To turn a raw image into something that is projected orthographically, like a map, you have to use a terrain model – a 3D map of the planet’s surface. Then you can use information about where the satellite was and the angle its sensor was pointing, and for each pixel in the output image, you can project it out to see at what latitude and longitude it must have intersected the ground. Then you move all the pixels to their coordinates in some convenient projection, and you’ve essentially taken the image out of perspective and made it orthographic.
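To get a feel for how much the angled view matters, here is a back-of-the-envelope sketch. It assumes a locally flat Earth and takes the view angle as measured from vertical at the ground; the function names and the example numbers are hypothetical. The same arithmetic gives the position error you get when the terrain model's height is wrong by that amount.

```python
import math


def relief_displacement_m(height_m, view_angle_deg):
    """Horizontal shift of a point `height_m` above the assumed surface."""
    return height_m * math.tan(math.radians(view_angle_deg))


def relief_displacement_px(height_m, view_angle_deg, gsd_m):
    """The same shift expressed in pixels of ground sample distance `gsd_m`."""
    return relief_displacement_m(height_m, view_angle_deg) / gsd_m


# A 30 m rooftop (or a 30 m terrain model error) seen 20 degrees off vertical,
# in imagery with 0.5 m pixels.
print(relief_displacement_m(30, 20))
print(relief_displacement_px(30, 20, 0.5))
```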

Except:

  • Earth’s surface is rough at every scale, and even “porous” or multiply defined in the sense that there are features like leafless trees that make it hard to define where the optical surface actually is at any given scale.
  • There is no perfectly accurate, precise, global, completely up-to-date terrain model of the Earth, let alone at a reasonable price. SRTM is pretty good but it’s only about 30 m, stops short of the Arctic, and is 20+ years out of date: there are entire lakes, highway cuts, and reclaimed islands that don’t exist in it.
  • Satellites typically only know where they’re pointing to within the equivalent of about 10 pixels (which, to be fair, is usually an extremely small fraction of a degree), so the pointing data can only narrow things down, not actually tell you where you are.
  • Continental drift means that a continent can easily move by 1 px over the lifetime of a high-end commercial satellite; a major earthquake can discontinuously distort a small region by several meters.
  • To properly pin down an image (i.e., to check the reported pointing angle), you need to know the exact 3D location of 3 visible points within it, and realistically more like 10.
  • All these errors can combine.
  • No matter what, you can’t recover occluded features, i.e. things you can’t see in the original data. If you want a high-res satellite image of something like a canyon, you realistically need half a dozen images at very specific angles, which is extremely hard.
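The continental drift item in the list above is easy to sanity-check with arithmetic. The inputs here are hypothetical round numbers (plates move on the order of centimeters per year; the 10-year lifetime and the pixel sizes are assumptions for the example).

```python
def drift_px(speed_cm_per_year, years, gsd_m):
    """Plate motion accumulated over `years`, expressed in pixels of size `gsd_m`."""
    drift_m = speed_cm_per_year / 100 * years
    return drift_m / gsd_m


print(drift_px(7, 10, 0.5))  # fast plate, sharp commercial pixels
print(drift_px(2, 10, 30))   # slow plate, 30 m survey pixels
```

The first case comes out above one pixel and the second is a tiny fraction of one, which is why this only becomes a practical concern at high resolution.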

We could go on! Georeferencing and orthorectification is a difficult problem. It’s easier for lower-resolution satellites, because a given angular error comes out to fewer pixels. Also, survey-mode satellites like Landsat and Sentinel-2, which are nadir-pointing anyway, put a lot of effort into doing this well. Two Landsat scenes will almost always coregister to well within a pixel. Sentinel-2 is a little less reliable, especially toward the poles. Commercial imagery is often displaced by far more than you would think. One way to see this is to step back in Google Earth Pro’s history tool, especially somewhere relatively remote and rugged.
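The claim that a given angular error comes out to fewer pixels at lower resolution can be sketched like this. The 500 km range and the pointing error are assumed values picked to land near the "about 10 pixels" figure above, not the specifications of any real satellite.

```python
import math


def ground_error_m(angle_error_deg, range_m):
    """Ground displacement caused by a pointing error at a given sensor-to-ground range."""
    return range_m * math.tan(math.radians(angle_error_deg))


def error_px(angle_error_deg, range_m, gsd_m):
    """The same displacement in pixels of ground sample distance `gsd_m`."""
    return ground_error_m(angle_error_deg, range_m) / gsd_m


pointing_error_deg = math.degrees(1e-5)  # hypothetical: 10 microradians
for gsd_m in (0.5, 10, 30):
    print(gsd_m, error_px(pointing_error_deg, 500_000, gsd_m))
```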

Here’s a farm in Nepal: 28.553, 84.2415. Just step back in time and watch it jump around underneath the pin. If you really want to be scared, watch the cliff to its north. This is why imagery analysts who understand imagery pipelines rarely use a whole lot of significant digits in their coordinates! If all you have to go on is a satellite image, you don’t really know where anything on Earth is, in absolute terms, to better than a few meters at best.
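To connect significant digits to distances, here is a rough sketch of how much ground one step in the last decimal place of a coordinate covers. It assumes a spherical Earth with about 111.32 km per degree of latitude, which is an approximation; the function name is made up for the example.

```python
import math

METERS_PER_DEGREE_LAT = 111_320  # approximate, spherical-Earth assumption


def decimal_place_size_m(places, latitude_deg):
    """Ground size (north-south, east-west) of one unit in the last decimal place."""
    step_deg = 10 ** -places
    north_south = METERS_PER_DEGREE_LAT * step_deg
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# At the latitude of the farm above.
for places in range(3, 7):
    ns, ew = decimal_place_size_m(places, 28.553)
    print(places, round(ns, 2), round(ew, 2))
```

Under these assumptions the fourth decimal place is already on the order of 10 m, so digits past the fifth are finer than the georeferencing of most imagery can support.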

Atmospheric correction

Over long distances, even in clear weather, the atmosphere scatters and absorbs light. This is why distant hills are low-contrast and blueish (blue light is scattered more). What a satellite actually measures is called top-of-atmosphere radiance, or TOA. This is a measurement of nothing more than the amount of energy received per second, per pixel, per band. It can be measured pretty objectively. However, it’s often not what you want. For one thing, it’s too blue. For another, the amount of blueness and related effects will vary semi-randomly with atmospheric conditions (humidity, maybe dust storms or wildfire smoke, etc.) and predictably with season.

Therefore, a reasonable desire is to basically normalize the sun and remove the effects of the atmosphere. What we’re trying to model here is called surface reflectance (SR). The main issue is that we don’t know the true state of the atmosphere at the moment the image was acquired. The best we can do is to model it and subtract it out. This is one of the central problems in remote sensing, and you could earn a PhD by improving one of the major models by a few percent.
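The "normalize the sun" half, taken on its own and ignoring the atmosphere entirely, is commonly written as top-of-atmosphere reflectance: radiance scaled by how much sunlight was available in that band at that sun angle. The sketch below uses that standard form; the parameter names and example values are hypothetical, and real products publish their own per-band constants.

```python
import math


def toa_reflectance(radiance, solar_irradiance, solar_zenith_deg, earth_sun_au=1.0):
    """TOA reflectance = pi * L * d^2 / (E_sun * cos(solar zenith)).

    radiance and solar_irradiance must be in matching per-band units;
    earth_sun_au is the Earth-Sun distance in astronomical units.
    """
    cos_zenith = math.cos(math.radians(solar_zenith_deg))
    return math.pi * radiance * earth_sun_au ** 2 / (solar_irradiance * cos_zenith)


# Same measured radiance, sun overhead versus sun low in the sky.
print(toa_reflectance(80.0, 1900.0, 0.0))
print(toa_reflectance(80.0, 1900.0, 60.0))
```

This step is deterministic because it only needs the acquisition geometry. Getting from here to surface reflectance is the part that needs an atmosphere model.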

The good news is there’s a brutally simple method that works pretty well most of the time. Dark object subtraction means assuming that the darkest pixel in the image should be pure black. Therefore, if you subtract out however much blue (and green, and so on) signal is present in the darkest pixel, you will have canceled out all the haze. It’s annoying how well this works considering how basic it is. It’s roughly equivalent to the auto-adjust tool in an image editor like Photoshop, or, to be a little more exact, a little like using the eyedropper in the Levels tool to set the black point to the darkest pixel.
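Here is a minimal sketch of dark object subtraction as described above: find the darkest pixel, treat its value in each band as haze, and subtract that from the whole band. The function name and the toy values are made up for the example. Practical variants often pick the dark value per band or from a low percentile instead of a single pixel, which is more robust to noise.

```python
def dark_object_subtract(bands):
    """bands: dict of band name -> 2D list, all the same shape.

    Returns (corrected bands, per-band haze estimate).
    """
    names = list(bands)
    rows = len(bands[names[0]])
    cols = len(bands[names[0]][0])
    # The darkest pixel is the one with the smallest value summed over all bands.
    di, dj = min(
        ((i, j) for i in range(rows) for j in range(cols)),
        key=lambda ij: sum(bands[n][ij[0]][ij[1]] for n in names),
    )
    haze = {n: bands[n][di][dj] for n in names}
    # Subtract the haze estimate from every pixel, clamping at zero.
    corrected = {
        n: [[max(v - haze[n], 0.0) for v in row] for row in bands[n]]
        for n in names
    }
    return corrected, haze


# Hypothetical TOA values for a 2x2 scene; the top-left pixel is deep shadow.
toa = {
    "blue":  [[0.12, 0.20], [0.30, 0.15]],
    "green": [[0.08, 0.18], [0.28, 0.10]],
    "red":   [[0.05, 0.17], [0.27, 0.09]],
}
corrected, haze = dark_object_subtract(toa)
print(haze)
print(corrected["blue"])
```

In this toy scene the haze estimate is largest in blue, so the correction removes more blue than red.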

Correction to reflectance may or may not attempt to correct for terrain effects (i.e., relighting the scene). Different pipelines have different conventions for how far to correct or what to call different kinds of correction.

Atmospheric correction is usually not key for OSINT purposes, but any time you find yourself taking exact measurements of pixel values, you should at least know whether you’re working in TOA or in SR, and if SR, you should have a sense of what the pipeline was.

Common optical sensor types

This section is a stub. Please start it!

  1. Pushbroom
  2. Whiskbroom
  3. Full-frame

Thermal

This section is a stub. Please start it!