Saturday, February 18, 2017

Total Solar Eclipse 2017

I live in a pretty cool place, from a certain nerdy viewpoint. 119 species of birds, two years running, might indicate a certain predictability, but a closer look at the data destroys that notion.

So how cool is this place, really? My subjective measures include things like species counts, whether I can get reasonable photos, etc. Subjective in this case means entirely subjective. So what might tip this place into Coolest Place I Have Ever Lived?


Yeah, that might do it. Though no photo did it justice. Film doesn't have the dynamic range, and digital cameras are worse in that respect. There's a wide pearlescent glow from the solar corona seen IRL which is entirely missing from short exposure times. This image predates optimizing over a set of stacked images.

It's a photo of a photo that has been hanging on some wall of pretty much every place I have ever lived since 1979. Yes, I am an old fart: deal with it. No, it's not related to Sauron in any way, save perhaps as being inspirational to film-makers for major production houses. Possibly. The mechanics of how films are actually made (and taxes, payments to the Tolkien estate avoided, etc.) entirely escape me.

You may want to visit https://en.wikipedia.org/w/index.php?title=Solar_eclipse_of_February_26,_1979&oldid=761573206

That's the link as of this date: I've been burned by not specifying dates. Pull quote:

Many visitors traveled to the Pacific Northwest to view the eclipse,[1] since it would be the last chance to view a total solar eclipse in the United States for almost four decades. The next over the United States will be the total solar eclipse of August 21, 2017.
Although the path of totality passed through Portland, Oregon in early morning, it was not directly observable from the Portland area due to overcast skies.[2]

That last line matters. In 1979 I was driving up the Columbia Gorge, seeing small holes of blue sky in wide overcast, and trying to judge where said blue holes might line up with the sun, during that brief period of totality. A fast car and a certain disrespect for law and order won the day. Everything lined up, and I skidded to a stop at Horsethief Lake State Park in time to see the whole event.


  • Shadow racing through the gorge at a thousand miles per hour
  • Weird greenish light, entirely unexpected, before totality
  • Shadow bands rippling across the ground.

It was awesome, in the original sense of the word. This is the Pacific NW. I saw Mount St. Helens erupt a year later, so I'm not a stranger to drama.

So here we are, that long 40 years later, as mentioned in the above pull quote. I'm now a certifiable Old Fart who never expected to live this long. But that narrow path of totality will sweep directly over my place  on August 21. In place of vile winter weather,  I have the best weather of the year, and all I have to do, essentially, is walk outside. How cool is that?

Being a complete nerd, I'll go a bit further. I'll spread my parachute canopy across the yard below a second-story deck, and hope for a shadow band photo opportunity, etc. But mostly, I just want to experience the event. I lack the words to describe a total solar eclipse. Perhaps that is the true meaning of 'awesome': you just can't really express it.

Of one thing I am certain: on 2017-08-21, this weird little place in small-town Oregon will become The Coolest Place That I Have Ever Lived.

Saturday, February 11, 2017

Exploring Data From the Linux Command Line

A few days ago, we saw the first signs that perhaps the worst of an unusually cold and wet winter might be ending: a temperature over 60°F!

A neighbor commented that it had been a long time since the last one, and I was curious as to exactly how long. For reasons of my own, I keep data files on what's recorded at the nearest weather station with what I consider to be fairly reliable data. So it only took a couple of minutes of exploratory hacking around at a shell prompt to get my answer. Here’s what I did, and the result I got.

grep ^161[0-2] 1606010000-1612311600 | awk '{print $1" "$5}' | grep -E '6.{3}$' | tail -n1
1611191600 61.0

It seems longer, but the last day of ≥ 60.0°F temperature was 2016-11-19, and as a side-effect we also get the last time of the last day: 1600 (4PM for those of you who don't use 24-hour time). We could get rid of that side-effect; they are usually a Bad Thing in code. But in this case the source is obvious (as will be shown below), and entirely beneficial. It extracts another piece of information from our data at zero computation cost. Exploratory code for the win.

Before I get into what our pipeline is doing, a note about the file. These are raw data - fields are separated only by whitespace. Lines begin with time and date encoded as YYMMDDTTTT. Hence the first field meaning of the result seen above, and the file name 1606010000-1612311600. It reflects the start-stop dates and times of the file. That can be a useful convention: in this case it immediately reveals that the data are incomplete. The station failed to record after 1600 on New Year's Eve.
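If you would rather not decode those stamps by eye, here is a minimal sketch of one way to do it. The function name decode_stamp is made up for illustration, and the '20' century prefix is an assumption that happens to hold for this file.

```shell
decode_stamp() {
  # $1: a YYMMDDTTTT stamp such as 1611191600.
  # Assumes every year is 20YY; that is not encoded in the stamp itself.
  echo "$1" | awk '{
    printf "20%s-%s-%s %s:%s\n",
      substr($1, 1, 2), substr($1, 3, 2), substr($1, 5, 2),
      substr($1, 7, 2), substr($1, 9, 2)
  }'
}

# Decode the stamp from the result shown earlier
decode_stamp 1611191600
```

The same function works on either half of the file name, which is just two of these stamps joined by a hyphen.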

Additionally, we can use the wordcount program in linecount mode to see that we are starting with a file containing 5541 lines (records, though there is a 3-line header, which I won't bother to filter out).
wc -l 1606010000-1612311600
5541 1606010000-1612311600

1- grep ^161[0-2] 1606010000-1612311600, in which grep (a pattern-matching tool) is supplying all lines (records) from our file that begin (specified via ^) with 161, if the next digit is 0-2. I was only interested in months 10-12 of 2016 (and 2016 data are all that is in this file), because I knew the last date of ≥ 60.0°F would be in there somewhere. We now have only records from our period of interest. If we ended here, our output would be 
1610010000  24.10    3.0  163.0   53.0   51.0   80.0   12.9  204.0    6.0    0.0
...
1612311600  48.70    2.0   99.0   35.0   35.0   87.0   13.3  173.0    8.0   47.0

I'm using the ellipses in place of 2608 lines of output. wc -l shows 2610. We've filtered out nearly half of our data. Now we pipe (the | character) those lines to awk.

2- awk '{print $1" "$5}', where we instruct awk (a pattern scanning and processing language, of which more later) to print only that first datetime field, a space, then field 5, which contains the temperature, of each line of input it receives. Now we're down to only the fields of interest within our period of interest.  Had we stopped here, our output would still be 2610 lines, but only 2 fields out of 11, formatted as
YYMMDDTTTT NN.N.

This 2nd stage of our filter removed about 2/3 of its incoming data. I'm just guesstimating by looking at line lengths here, but you can get accurate numbers using wc again, before and after this stage. Specify -c instead of -l to count bytes instead of lines. I'll skip the demonstration. Now we send that on to grep again, but specifying different options.
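For anyone who does want the numbers, a rough sketch of that byte count might look like the following. stage_bytes is a hypothetical helper name, not anything I actually ran; it simply runs the pipeline twice, once up to the awk stage and once through it.

```shell
stage_bytes() {
  # $1: data file. Prints the byte count entering and leaving the awk stage.
  bytes_in=$(grep '^161[0-2]' "$1" | wc -c)
  bytes_out=$(grep '^161[0-2]' "$1" | awk '{print $1" "$5}' | wc -c)
  # $(( )) strips any padding that some wc implementations add
  echo "in: $((bytes_in)) out: $((bytes_out))"
}

# Usage, against the file from this post:
#   stage_bytes 1606010000-1612311600
```

Running the first stage twice is wasteful, but for a file of this size that hardly matters, and it keeps the sketch short.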

3- grep -E '6.{3}$' contains the -E (Extended) option, which enables the {} syntax so that we can specify how many instances of a character we want to match. The preceding dot can be read as 'any one character', so .{3} matches exactly three characters of any kind. The trailing '$' matches the end of line -- the opposite of the '^' we used in our first call to grep. The net effect is that only lines ending in a '6' followed by any 3 characters will survive. Given our NN.N format for the temperature field, we filter out anything except 6N.N, and wc -l now shows only 222 of those short lines left, of 2610. Having filtered out all but 1/12 or so of the data coming into this stage, we now filter down to one line - our answer.
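To see what that pattern does in isolation, here is a small sketch using three made-up lines in the same YYMMDDTTTT NN.N format (only the middle one comes from the real result). It also counts matches without the '$', to show why the anchor matters.

```shell
# Only the line whose temperature is 6N.N should survive the anchored pattern
matches=$(printf '%s\n' '1611191500 59.9' '1611191600 61.0' '1611191700 46.3' \
  | grep -E '6.{3}$')
echo "$matches"

# Without the anchor, the 6 in the leading "16" of every stamp matches instead
unanchored=$(printf '%s\n' '1611191500 59.9' '1611191600 61.0' '1611191700 46.3' \
  | grep -cE '6.{3}')
echo "unanchored matches: $unanchored"
```

Note that the pattern only catches temperatures in the sixties. That was enough for late autumn here, but a numeric comparison would be needed to catch 70 and above.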

4- tail -n1, which returns only the last n lines, and we specify n=1. Because the data are in increasing time/date order (as can be seen in the output of our first filter) this gives us our last datetime, and answers our question, with greater accuracy than we had thought to ask.

If we needed the date and nothing but the date, we could modify our usage of awk, which is a pattern scanning and processing language. GNU awk has some very interesting capabilities, such as floating point math, true multidimensional arrays, etc. This entire task could have been done in awk, but I wanted to show more of the shell tools, and pipelines, not just 'Cool Things We Can Do With GNU awk'. [1]
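As a sketch of what an awk-only version might look like (not what I actually ran), the function below keeps the same period filter, swaps the regex on the temperature for a numeric comparison, and prints only the date part of the stamp. The name last_warm_date is hypothetical.

```shell
last_warm_date() {
  # $1: data file with the datetime in field 1 and the temperature in field 5.
  # Remembers the stamp of the last Oct-Dec 2016 record at or above 60.0,
  # then prints its first six characters (YYMMDD).
  awk '$1 ~ /^161[0-2]/ && $5 >= 60 { last = $1 }
       END { if (last != "") print substr(last, 1, 6) }' "$1"
}

# Usage, against the file from this post:
#   last_warm_date 1606010000-1612311600
```

The numeric test means a 70-degree day would be caught as well, which the 6N.N pattern in the pipeline would miss.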

The Shell Will Probably Always Belong in Your Toolbox

I often use far more sophisticated tools when I want to take a long hard look at data. But, file formats vary, data may be missing, etc. As a rule of thumb, you can expect to spend half of the total time spent analyzing data just seeing what's there, and cleaning it up. For much of that work, the shell is a great tool, and it's actually very common to spend a bit of time using the command line to explore. In a broad view, command-line tools can help you determine,  quickly, whether a particular data source contains anything of interest at all, and if so, how much, how it's formatted, etc. And finally, the commands can be saved as part of a shell script, and used over an arbitrary number of similar data files. 
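Saving the commands for reuse could be as simple as wrapping the pipeline in a function and looping over file names. This is only a sketch: last_warm_each is a made-up name, and passing the period of interest as the first argument is my assumption about how one might generalize it.

```shell
last_warm_each() {
  # $1: regex for the leading digits of the period of interest, e.g. '161[0-2]'
  # remaining arguments: data files in the same whitespace-separated format
  period=$1
  shift
  for f in "$@"; do
    result=$(grep "^$period" "$f" | awk '{print $1" "$5}' | grep -E '6.{3}$' | tail -n1)
    # Prefix each answer with the file it came from
    echo "$f: ${result:-no match}"
  done
}

# Usage, against the file from this post:
#   last_warm_each '161[0-2]' 1606010000-1612311600
```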

To a point, anyway. Shells are slow (particularly bash). Though of course there are tools to quantify that as well, and timing work on a subset of the data can give you an idea of when you are going to have to use something else. 'time' is available as a built-in if you are using the bash shell, and any Unix or Linux will also have a 'time' binary somewhere on your search path if the appropriate package is installed. On this machine it's /usr/bin/time, packaged as 'time'. Of the other tools used here, wc and tail are in the 'coreutils' package; grep and awk have packages of their own on most distributions. Which says something about how useful these tools are. If you aren't using them, you quite literally are not using the core of the Linux/Unix tools.

That is probably a mistake. There is a lot of data out there, stored as textual files of moderate size.

My Ulterior Motive for This

I wanted a post such that:
  1. I could advocate the command line, to people who seem to inappropriately default to spreadsheets, which are nothing more than another tool in the box. That box should contain several tools. Consider unstructured data. Or consider binary data formats, which are an intractable problem for both shells and spreadsheets.
  2. It had absolutely nothing to do with security work. Because people are going to be justifiably sensitive about exactly whose security data I might be using as an example. But everybody talks about the weather.
If anyone wants to play with the data, it's available at:
https://drive.google.com/open?id=0B0XLFi22OXDpR3h0UUQ1cmNWbkk
Note to self: find another home for this sort of thing. Google Drive can't even preview a text file.
Note to all: this is not a promise to keep it there for any significant period of time. If I need the space for other things (like client-related things), that file is very, very gone. I recently VVG'ed most of what was in /pub.

[1] I do have one idea for something I'll do with awk one of these days. Because who doesn't like univariate summary statistics combined with 4000 year old Babylonian math, and using NIST-certified results to validate (or invalidate, as the case may be) our code?

Tuesday, September 20, 2016

Greater Yellowlegs

This August, I didn't find nearly the number of species of birds that I did in 2015. This month is also a lot slower. Unlike some birders, for whom the list length is everything (insert obvious crude comparison here), I'm fine with that. Species counts are just another tool I use to try to understand what is going on, on my patch. Counts happen to be a powerful tool, if used well, but it's about understanding, not competition.

Here are a couple of Greater Yellowlegs (a sort of large sandpiper) that I don't see enough of. Work can be a pressure cooker environment, the recent news reports are usually depressing, etc. Being able to walk out the back gate, go down to the river and see a couple of neat birds and fall color, reflected in late-summer low river levels, is a welcome break.

Well. That either matters to you, or it doesn't. If not, I hope you have some other means of coping.

Greater Yellowlegs, Willamette River, Linn Co., OR, 2016-09-04



Friday, August 26, 2016

Does work really expand to fill all available hours?

That might be a perception issue. In a second effort (this week) to free up more time, I just invested half an hour to run an optimization experiment. Amazingly successful and I'll probably save 4-5 hours per week, for a month or more. Huge win, to be sure. Counting both efforts, I get 6-7 hours back.

The thing is, I didn't start really looking for optimizations until I passed a pain threshold. I expect that is pretty typical behavior for us all, and that really sucks for me, on a couple of levels.

First off is professional. Always optimizing stuff is part of the gig.

Second is just personal embarrassment, because missing a forehead-slappingly easy test for bias, is, well, personally embarrassing.

That bit of folk wisdom, that work expands to fill all available hours? Like much folk wisdom, not buying it. This was just the most recent iteration of the problem. I think it's much more about pain thresholds, and when we finally realize we can't fit that next Desired Thing into the schedule. Only then do we scurry off and find fixes for the problem.

Perhaps this is a LifeHacking thing. Hard to tell: trying to follow whatever fashion is currently playing out on the Internet is usually an expertise in futility.

But I plainly need to lower my pain threshold, and optimize sooner.

Monday, May 2, 2016

Taking a Break with a Bald Eagle

After an 0-dark-thirty start to Monday, I was ready for a break by mid-morning. Grab a fresh cup of Productivity Fluid, and out onto the deck. That deck is on the second story, and faces a large Black Walnut, the slope down to the river bank, etc. It's a bit of a habit to grab binoculars and a camera on the way out. This morning, the idea was to do a quick bird count, for submission to eBird. Because it's the height of Spring migration, and weird things happen.

And indeed they did. About 40 feet away, in that Black Walnut, were Wood Ducks. Which actually have clawed feet, perch in trees, etc. Hence the name. Photos didn't work out too well. These birds are usually shy, with good reason: there are a lot of hunters on the river during the season. Typically, I see them from a couple of hundred feet away, as they are headed elsewhere. Bummer about the photos, though. Drakes are almost cartoonishly colorful. But the drake had a branch between us, and it was obvious that if I moved much, they were going to spook.

And ... they did.

A few minutes later, an immature Bald Eagle flew into the same tree, and pretty obviously was not worried about me at all. There was a lot of ray-catching and preening involved. Here's the bird pausing from a bit of luxurious back-preening to make sure the silly human isn't doing anything, well, silly.


And of course, I had to take the obligatory head photo. 


That bird hung out for at least two hours. Seemingly just enjoying the morning. I, unfortunately, had to get back to the salt mines. An early start already looks like it will extend into a late night.

How's your day going?

Wednesday, April 27, 2016

Green-hued Purple Finch

April is drawing to a close. It's been a great (for small values of great) month on the birding front. I added half a dozen or so species to the April all-years list at my local patch. Which is currently wedged at 99, and seems likely to finish that way. So close, and yet so meh. I'm chalking that up to a somewhat early migration for much of the year so far. Which seems to be ending.

This blog is very much not the place for accounts of the truly rare. Odd, I can occasionally do. My patch is a bit different, in that many local birders see White-crowned Sparrow, while I see White-throated, etc. Another difference lies in Purple Finch. Which, for whatever reason, seem more common here than what is typically seen along the Benton/Linn Co. (Oregon) border. Common enough that I get to see PUFI (4-letter banding code for Purple Finch; see the canonical USGS reference for the whole thing) in unusual plumage.

This group is a bit prone to weirdness. I have photos of House and Purple Finch in hues that might be best described as golden, rather than red/rose/purple, and I've seen references to that being a function of diet. But green is a bit off-the-wall, in my experience. Here is the only green-hued Purple Finch I've ever seen, and that was on 2016-04-01. April Fools Day. No way was I going to post that the day I saw her.



But perhaps not so outlandish as all that. A Web search found one reference, the Purple Finch entry in John J. Audubon’s Birds of America, which seems to indicate that this hue can be common, at least toward the eastern US. OTOH, that was a long time ago, far from my patch in Oregon's Willamette Valley, and Audubon was, well, a bit dubious in some respects.

What do modern field guides have to say? In alphanumeric order, looking for any reference to 'green' I found the following.

  • National Geographic Field Guide to the Birds of North America: no mention.
  • Peterson Field Guide to Western Birds, 2nd edition: no mention.
  • Peterson Field Guide to Western Birds, 3rd edition: no mention.
  • Sibley Guide to the Birds, 1st edition: "Pacific females are washed greenish above..."
  • Sibley Guide to the Birds, 2nd edition: "Pacific females are washed greenish above..."
  • Stokes Field Guide to the Birds of North America: no mention.

Does this, in any way, constitute a recommendation for a field guide? Well, no. Aberrant golden hues are, in my limited experience as a patch birder, far more common amongst House and Purple Finch. I've seen dozens of golden-hued birds of each species, and exactly one greenish finch. Yet golden birds get no mention at all.

Does that mean that I regard popular field guides as equally wrong? Well, no. Tremendous effort was expended by very talented people in creating these guides. A lot of financial risk was assumed by all parties -- including publishers. Personally, I doubt that the vagaries of plumage variations can ever be adequately described in a field guide. Not least because human languages cannot adequately describe color. Ask a fly fisher what 'dun' refers to.

I confess that Sibley is my favorite, but this is not an example of why.












Saturday, April 16, 2016

Killdeer are Laying

Spring rolls on. While down along the river this morning, I heard my first House Wren of the year (a nesting bird), and found the Killdeer were already at it. As you can see, Killdeer don't go in for nest building -- just a simple scrape. But the eggs aren't actually that noticeable; one could step on them quite easily, and watching where you are walking is advisable. Luckily when Killdeer are in the area, you will likely know it due to their strident alarm calls, and hair-trigger sensitivity, occasionally feigning a broken wing to lure predators away from the nest, etc.

Killdeer nest and eggs

Their appearance is also distinctive. People that have essentially no knowledge of birds often recognize them when shown a photo: they just didn't know what they were called. So here's a photo. The Killdeer is the bird on the left. The bird on the right is a Greater Yellowlegs. This photo was taken April 3, but is likely one of the same birds involved with this nest. I'm using this image because it contains that Yellowlegs. That is a matter for another post, but this way I get to use the same image, and I'm lazy.

Killdeer and Greater Yellowlegs

In addition to not stomping up a beach like a Marine going into combat, likely crushing eggs, there are a couple of other guidelines related to nests that you might want to be aware of. Simple things like how to minimize the disturbance you cause. All About Birds has already posted one, so I don't have to.

I specifically wanted to limit my time at this nest, on this day. I had already seen three species of corvid (Steller's Jay, Western Scrub Jay, and American Crow). Corvids are infamous nest-robbers. The tape measure is always with me in the field -- it's not like I had to go get it and return, causing two disturbances. My total time at the nest was 1:38 from first to last image according to Exif data, and laying down a tape obviously took only seconds. One other thing I did was place a couple of rocks to point to the site. My total time was still under 2 minutes.

With the rocks, I can find it again from a distance, and have no need to closely approach it. The normal clutch size is 4-6 eggs, so these birds just started. Incubation period is 22-28 days. In a month or so, I should see tiny little fuzz-balls on the beach. They can walk away from the nest as soon as their feathers are dry, and they're fun to watch.